Improved Isoprenoid Production

The present invention relates to novel polynucleotide and polypeptide sequences useful in
the isoprenoid biosynthetic pathway. More particularly, the present invention provides
recombinantly produced cells that exhibit improved production of zeaxanthin. Methods of
making and using such cell lines are also provided.

Carotenoids are commercially important C-40 isoprenoid compounds used as nutritional
supplements, pharmaceuticals and food colorants for humans and as pigments for animal
feed. Currently, industrially important carotenoids are produced mainly by chemical
synthesis (β-carotene, canthaxanthin and astaxanthin) or by extraction from natural sources
(lutein from marigold, capsanthin from paprika). Production of carotenoids using
microorganisms has, however, been achieved in some cases. For example, β-carotene is
produced by fermentation with the fungus Blakeslea trispora (US 5,328,845) or by pond
culture using the halotolerant alga Dunaliella salina [Borowitzka, J. Biotechnol. 70:313-321
(1999)]. Lycopene production has also been reported in B. trispora (WO 00/77234).
Astaxanthin is produced by fermentation using yeast (Phaffia rhodozyma, recently
renamed Xanthophyllomyces dendrorhous) (US 6,015,684) or in photobioreactors or open
ponds using the alga Haematococcus pluvialis [Lorenz and Cysewski, Trends Biotechnol.
18:160-167 (1999); Olaizola, J. Appl. Phycol. 12:499-506 (2000)]. Such microbial
production systems, however, do not produce carotenoids in amounts sufficient for
economical industrial scale production.

In the mid-1960s, scientists at Hoffmann-La Roche isolated several marine bacteria that
produced the yellow carotenoid zeaxanthin, which has application in poultry pigmentation
and in the prevention of age-related macular degeneration in humans. One bacterium,
which showed promising levels of zeaxanthin production, was given the strain designation
R-1512 and was deposited at the American Type Culture Collection (ATCC, Manassas,
VA, USA) as strain ATCC 21588 (US 3,891,504). Using the accepted taxonomic standards
of that time (classification performed by the Eidgenössische Technische Hochschule
(Zurich) and the National Collection of Industrial Bacteria, Torry Research Station
(Aberdeen, Scotland)), the zeaxanthin-producing organism was classified as a member of
the genus Flavobacterium, but no species designation was assigned.

An extensive mutagenesis and screening program was subsequently conducted to isolate
mutants of R-1512 with higher zeaxanthin productivities. With respect to the presently
described work, two such mutants are significant. These mutants, listed in order of their
zeaxanthin productivities, are R1534 and R114. A variety of other mutants have been used
over the years for biochemical studies of carotenoid biosynthesis [Goodwin, Biochem. Soc.
Symp. 35:233-244 (1972); McDermott et al., Biochem. J. 134:1115-1117 (1973); Britton et
al., Arch. Microbiol. 113:33-37 (1977); Mohanty et al., Helvetica Chimica Acta 83:2036-2053
(2000)].

The early attempts to develop a commercially viable fermentation process for the
production of zeaxanthin using classically derived mutants of strain R-1512 were not successful.
However, with the advent of molecular biology, the possibility arose that higher
zeaxanthin-producing strains could be developed. The first step in this direction was taken with
the cloning and sequencing of the carotenoid gene cluster from strain R1534
(US 6,087,152, which is hereby incorporated by reference as if recited in full herein).
US 6,087,152 discloses that the carotenoid genes were functionally expressed in Escherichia
coli and Bacillus subtilis, resulting in zeaxanthin production in these hosts. US 6,087,152
also disclosed that by modifying the carotenoid gene cluster or by adding a gene from an
astaxanthin-producing bacterium, it was possible to produce carotenoids other than
zeaxanthin (EP 872,554). Moreover, EP 872,554 disclosed that carotenoid production was
increased in strain R1534 by introducing cloned carotenoid gene clusters on a multi-copy
plasmid.

Despite the enormous structural diversity in isoprenoid compounds, all are biosynthesized
from a common C-5 precursor, isopentenyl pyrophosphate (IPP). Up until the early
1990s it was generally accepted that IPP was synthesized in all organisms via the
mevalonate pathway, even though some experimental results were not consistent with this biogenic
scheme [Eisenreich et al., Chemistry and Biology 5:R221-R233 (1998)].

Mevalonate pathway:

Acetyl-CoA acetyltransferase (thiolase) (atoB, phaA):
    2 Acetyl-CoA → Acetoacetyl-CoA + CoA-SH

HMG-CoA synthase (hcs):
    Acetoacetyl-CoA + Acetyl-SCoA + H2O → HMG-CoA + CoA-SH

HMG-CoA reductase (mvaA):
    HMG-CoA + 2 NADPH + 2 H+ → Mevalonate + 2 NADP+

Mevalonate kinase (mvk):
    Mevalonate + ATP → Mevalonate-5-phosphate + ADP

Phosphomevalonate kinase (pmk):
    Mevalonate-5-phosphate + ATP → Mevalonate-5-diphosphate + ADP

Diphosphomevalonate decarboxylase (mvd):
    Mevalonate-5-diphosphate + ATP → Isopentenyl diphosphate (IPP) + ADP + Pi + CO2
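
The scheme above can also be read as a small data structure. The following Python sketch is illustrative only: the names MEVALONATE_STEPS and net_balance are hypothetical, and the substrates and cofactors are limited to those shown in the scheme. It sums the six reactions to give the net consumption and production per molecule of IPP.

```python
from collections import Counter

# Each step: (enzyme, gene, substrates, products), transcribed from the scheme.
# Only the species shown in the scheme are listed (an assumption of this sketch).
MEVALONATE_STEPS = [
    ("acetyl-CoA acetyltransferase", "atoB/phaA",
     {"acetyl-CoA": 2}, {"acetoacetyl-CoA": 1, "CoA-SH": 1}),
    ("HMG-CoA synthase", "hcs",
     {"acetoacetyl-CoA": 1, "acetyl-CoA": 1, "H2O": 1},
     {"HMG-CoA": 1, "CoA-SH": 1}),
    ("HMG-CoA reductase", "mvaA",
     {"HMG-CoA": 1, "NADPH": 2, "H+": 2}, {"mevalonate": 1, "NADP+": 2}),
    ("mevalonate kinase", "mvk",
     {"mevalonate": 1, "ATP": 1}, {"mevalonate-5-P": 1, "ADP": 1}),
    ("phosphomevalonate kinase", "pmk",
     {"mevalonate-5-P": 1, "ATP": 1}, {"mevalonate-5-PP": 1, "ADP": 1}),
    ("diphosphomevalonate decarboxylase", "mvd",
     {"mevalonate-5-PP": 1, "ATP": 1},
     {"IPP": 1, "ADP": 1, "Pi": 1, "CO2": 1}),
]


def net_balance(steps):
    """Sum all steps; negative = net consumed, positive = net produced."""
    balance = Counter()
    for _enzyme, _gene, substrates, products in steps:
        for name, n in substrates.items():
            balance[name] -= n
        for name, n in products.items():
            balance[name] += n
    # Pathway intermediates cancel to zero and are dropped.
    return {name: n for name, n in balance.items() if n != 0}


if __name__ == "__main__":
    for name, n in sorted(net_balance(MEVALONATE_STEPS).items()):
        print(f"{name:>12}: {n:+d}")
```

On this reading, one IPP costs three acetyl-CoA, three ATP and two NADPH, which is why the supply of these precursors is a natural target when the pathway is amplified.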

The discrepancies have since been reconciled by the discovery of an alternate pathway of
IPP biosynthesis, the deoxyxylulose (DXP) pathway. (Note: the alternate pathway of IPP
biosynthesis has been referred to by various names in the scientific literature: DXP
pathway, DOXP pathway, MEP pathway, GAP/pyruvate pathway and the non-mevalonate
pathway. We use the name DXP pathway here only for the sake of simplicity.) The first
five reactions of the DXP pathway have been identified [Herz et al., Proc. Nat. Acad. Sci.
97:2486-2490 (2000)], but the subsequent steps leading to formation of IPP have not yet
been elucidated.



DXP pathway:

1-deoxy-D-xylulose-5-phosphate synthase (dxs):
    Pyruvate + D-glyceraldehyde-3-phosphate → 1-deoxy-D-xylulose-5-phosphate

1-deoxy-D-xylulose-5-phosphate reductoisomerase (dxr):
    1-deoxy-D-xylulose-5-phosphate + NADPH → 2C-methyl-D-erythritol-4-phosphate + NADP+

4-diphosphocytidyl-2C-methyl-D-erythritol synthase (ygbP):
    2C-methyl-D-erythritol-4-phosphate + CTP → 4-diphosphocytidyl-2C-methyl-D-erythritol + PPi

4-diphosphocytidyl-2C-methyl-D-erythritol kinase (ychB):
    4-diphosphocytidyl-2C-methyl-D-erythritol + ATP →
        4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate + ADP

2C-methyl-D-erythritol-2,4-cyclodiphosphate synthase (ygbB):
    4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate →
        2C-methyl-D-erythritol-2,4-cyclodiphosphate + CMP

    2C-methyl-D-erythritol-2,4-cyclodiphosphate → → IPP

McDermott et al. (supra) and Britton et al. [J. Chem. Soc. Chem. Comm. p. 27 (1979)]
showed that crude extracts of zeaxanthin-producing mutant strains derived from the
original Roche isolates incorporated labeled mevalonate into zeaxanthin. While there was no
reason to question this evidence for IPP biosynthesis via the mevalonate pathway, the work
was done prior to the discovery of the DXP pathway, and it has been reported that some
bacteria (Streptomyces species) possess both pathways for IPP synthesis and that expression
of these pathways is temporally regulated [Seto et al., Tetrahedron Lett. 37:7979-7982
(1996); Dairi et al., Mol. Gen. Genet. 262:957-964 (2000)]. In addition, at present, only a
small number of eubacteria have been shown to possess the mevalonate pathway for IPP
synthesis. The genes encoding the enzymes of the mevalonate pathway have been cloned
and sequenced from some of these bacteria [Wilding et al., J. Bacteriol. 182:4319-4327
(2000); Takagi et al., J. Bacteriol. 182:4153-4157 (2000)].

Several examples exist where the application of metabolic engineering has succeeded in
altering or improving carotenoid production in microorganisms [Lagarde et al., Appl. Env.
Microbiol. 66:64-72 (2000); Wang et al., Biotechnol. Bioeng. 62:235-241 (1999); Wang et
al., Biotechnol. Prog. 16:922-926 (2000) (and references therein); Sandmann et al., Trends
Biotechnol. 17:233-237 (2000); Misawa and Shimada, J. Biotechnol. 59:169-181 (1998);
Matthews and Wurtzel, Appl. Microbiol. Biotechnol. 53:396-400 (2000); Albrecht et al.,
Nature Biotechnol. 18:843-846 (2000); Schmidt-Dannert et al., Nature Biotechnol. 18:750-753
(2000)]. For example, E. coli, a non-carotenogenic bacterium, can be engineered to
produce carotenoids by introducing the cloned carotenoid (crt) genes from the bacteria
Agrobacterium aurantiacum, Erwinia herbicola or Erwinia uredovora (Misawa and Shimada,
supra). Harker and Bramley [FEBS Lett. 448:115-119 (1999)] and Matthews and Wurtzel
(supra) disclosed that carotenoid production in such engineered E. coli strains could be
increased by over-expressing the gene coding for 1-deoxy-D-xylulose 5-phosphate
synthase (DXPS), the first enzyme in the DXP pathway (E. coli possesses only the DXP
pathway for isoprenoid biosynthesis and does not use the mevalonate pathway [Lange et
al., Proc. Nat. Acad. Sci. 97:13172-13177 (2000)]). Harker and Bramley (supra) also
disclosed an increase in the isoprenoid compound ubiquinone-8 in the cells overproducing
DXPS. These results supported the hypothesis that limited availability of IPP, resulting
from insufficient in vivo activity of DXPS, was limiting the production of carotenoids and
other isoprenoid compounds in the engineered strains. Using a similar E. coli system, Kim
and Keasling [Biotechnol. Bioeng. 72:408-415 (2001)] disclosed that the combined
over-expression of the genes encoding DXPS and the second enzyme of the DXP pathway, DXP
reductoisomerase (1-deoxy-D-xylulose-5-phosphate reductoisomerase), gave higher
carotenoid production than over-expression of just the gene encoding DXPS.

All of these studies were done in E. coli engineered to produce carotenoids. Accordingly,
one disadvantage of these studies was that the amounts of carotenoids produced by these
recombinant E. coli strains were very low compared to the amounts produced by even
non-recombinant microorganisms used for industrial production of carotenoids.
Furthermore, improved carotenoid production in bacteria by genetic engineering of the IPP
biosynthetic pathway has only been shown in organisms that utilize the DXP pathway for IPP
formation. No similar studies have been reported for bacteria that produce IPP via the
mevalonate pathway.

Metabolic engineering of the mevalonate pathway to improve production of isoprenoid
compounds has been reported in yeast. For example, WO 00/01649 disclosed that
production of isoprenoid compounds is increased in Saccharomyces cerevisiae when the gene
coding for 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMG-CoA reductase) is
over-expressed. However, it has not been shown that this strategy improves isoprenoid
production in bacteria, and in particular, it has not been shown that carotenoid
production in bacteria can be improved by amplifying expression of mevalonate pathway
genes. While it has been shown that some mevalonate pathway genes from eukaryotes
[Campos et al., Biochem. J. 353:59-67 (2001)] and from the bacterium Streptomyces sp.
strain CL190 (Takagi et al., supra) can be expressed in E. coli, no increase in isoprenoid
production was reported in the strains.

In addition to the reactions that form IPP (via the DXP or mevalonate pathways) and the
reactions that convert farnesyl pyrophosphate (FPP) to various other isoprenoids (e.g.,
carotenoids, quinones), two other reactions are known to be involved in isoprenoid
biosynthesis. IPP isomerase interconverts IPP and its isomer, dimethylallyl pyrophosphate
(DMAPP). Two forms of IPP isomerase exist: the type 1 enzyme, which is well known in
eukaryotes and some bacteria, and the newly identified type 2 enzyme, which is FMN- and
NADP(H)-dependent [Kaneda et al., Proc. Nat. Acad. Sci. 98:932-937 (2001)].

Several reports disclose that in E. coli engineered to produce carotenoids, amplification of
native or heterologous type 1 IPP isomerase (idi) genes stimulates carotenoid production
[Kajiwara et al., Biochem. J. 324:421-426 (1997); Verdoes and van Ooyen, Acta Bot. Gallica
146:43-53 (1999); Wang et al., supra]. In one report (Wang et al., supra), it was further
disclosed that over-expression of the ispA gene, encoding FPP synthase (farnesyl
diphosphate synthase), increased carotenoid production in an engineered carotenogenic strain of
E. coli when combined with over-expression of the idi and crtE (GGPP synthase/geranylgeranyl
diphosphate synthase) genes. As is the case for the pathway of IPP biosynthesis,
however, it has not been shown that over-expression of genes coding for IPP isomerase or
FPP synthase improves carotenoid production in a naturally carotenogenic
microorganism. Also, the levels of carotenoids produced in the E. coli strains described above are
very low, and it has not been shown that these strategies work in an industrial
microorganism where carotenoid production was already high.

In sum, there is no prior evidence that increased expression of gene(s) coding for enzymes 
of the mevalonate pathway can improve production of carotenoids in naturally caroteno- 
genic bacteria or in naturally non-carotenogenic bacteria engineered to be carotenogenic. 

One embodiment of the present invention is an isolated polypeptide that includes an
amino acid sequence selected from the following group: (a) an amino acid sequence shown
as residues 1 to 340 of SEQ ID NO:43; (b) an amino acid sequence shown as residues 1 to
349 of SEQ ID NO:45; (c) an amino acid sequence shown as residues 1 to 388 of SEQ ID
NO:47; (d) an amino acid sequence shown as residues 1 to 378 of SEQ ID NO:49; (e) an
amino acid sequence shown as residues 1 to 305 of SEQ ID NO:51; (f) an amino acid
sequence shown as residues 1 to 332 of SEQ ID NO:53; (g) a fragment of an amino acid
sequence selected from the group consisting of SEQ ID NOs:43, 45, 47, 49, 51, and 53,
wherein said fragment has at least 30 contiguous amino acid residues; (h) an amino acid
sequence of a fragment of a polypeptide selected from the group consisting of SEQ ID
NOs:43, 45, 47, 49, 51, and 53, the fragment having the activity of HMG-CoA reductase,
isopentenyl diphosphate isomerase, hydroxymethylglutaryl-CoA synthase (HMG-CoA
synthase), mevalonate kinase, phosphomevalonate kinase, or diphosphomevalonate
decarboxylase; (i) an amino acid sequence of a polypeptide encoded by a polynucleotide that
hybridizes under stringent conditions to a hybridization probe comprising at least 30
consecutive nucleotides of SEQ ID NO:42 or a complement of SEQ ID NO:42, wherein the
polypeptide has the activity of HMG-CoA reductase, isopentenyl diphosphate isomerase,
HMG-CoA synthase, mevalonate kinase, phosphomevalonate kinase, or
diphosphomevalonate decarboxylase; and (j) a conservatively modified
variant of SEQ ID NOs:43, 45, 47, 49, 51 or 53.

As noted above, the present invention includes SEQ ID NOs:43, 45, 47, 49, 51, and 53,
which are polypeptide sequences that correspond to the following enzymes of the
mevalonate pathway: hydroxymethylglutaryl-CoA (HMG-CoA) reductase, isopentenyl
diphosphate (IPP) isomerase, HMG-CoA synthase, mevalonate kinase, phosphomevalonate
kinase, and diphosphomevalonate decarboxylase, respectively. The present invention also
includes at least 30 contiguous amino acids of each identified sequence or a sufficient
number of contiguous amino acids to define a biologically active molecule.

The present invention also includes fragments of a polypeptide selected from SEQ ID
NOs:43, 45, 47, 49, 51, and 53. The fragment should be at least about 30 amino acids in length
and must have the activity of the identified polypeptide, e.g., in the case of SEQ ID NO:43,
a fragment thereof that falls within the scope of the present invention has the activity of
HMG-CoA reductase. As used herein, a measure of activity of the respective fragments is
set forth in Example 1. A fragment having an activity above background in the assays set
forth in Example 1 is considered to be biologically active and within the scope of the
present invention.

The present invention also includes an amino acid sequence of a polypeptide encoded by a
polynucleotide that hybridizes under stringent conditions, as defined above, to a
hybridization probe that contains at least 30 contiguous nucleotides of SEQ ID NO:42 (i.e.,
the mevalonate operon) or a complement of SEQ ID NO:42. The polynucleotide must
encode at least one of the enzymes in the mevalonate pathway. For purposes of the present
invention, a "hybridization probe" is a polynucleotide sequence containing from about 10
to about 9066 nucleotides of SEQ ID NO:42.
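
As an illustration of what "at least 30 contiguous nucleotides ... or a complement" covers, the following Python sketch enumerates every 30-nucleotide window of a sequence together with its reverse complement. It is a sketch under stated assumptions: the function names are hypothetical, the input is a short made-up toy sequence (not SEQ ID NO:42), and "complement" is taken here to mean the reverse complement, i.e. the strand that would base-pair with the probe.

```python
# Complement table for unambiguous DNA bases only (assumption: no N or IUPAC codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probes(seq, length=30):
    """Yield (start, probe, reverse_complement) for every window of `length`
    nucleotides; `start` is 1-based, as in sequence listings."""
    seq = seq.upper()
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        yield i + 1, window, reverse_complement(window)


def probe_count(sequence_length, length=30):
    """Number of distinct start positions for a probe of the given length."""
    return max(sequence_length - length + 1, 0)


if __name__ == "__main__":
    toy = "ATGGCGACCCTGAAAGTTCTGGGTATCGATCTGGCGTAA"  # 39 nt, made up
    for start, probe, rc in probes(toy):
        print(start, probe, rc)
    # A 9066-nt sequence has this many 30-nt windows per strand:
    print(probe_count(9066))
```

For a 9066-nucleotide sequence the window count is 9066 - 30 + 1 = 9037 per strand, which gives a sense of how broad the 30-nucleotide probe definition is.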

In this embodiment, the isolated polypeptide may have the amino acid sequence of SEQ
ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51 or SEQ ID
NO:53. Alternatively, the isolated polypeptide may contain about 30 contiguous amino
acids selected from an area of the respective amino acid sequence that has the least
identity when compared to an enzyme with the same function from a different species.
Thus, for example, a polypeptide of the present invention may include amino acids 68-97
of SEQ ID NO:43, 1-30 of SEQ ID NO:45, 269-298 of SEQ ID NO:47, 109-138 of SEQ ID
NO:49, 198-227 of SEQ ID NO:51 or 81-110 of SEQ ID NO:53.
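
One way such a least-identity region could be located is with a sliding window over a pairwise alignment. The Python sketch below is illustrative only and is not the procedure used to select the residue ranges recited above: the function name is hypothetical, the two inputs are assumed to be already aligned strings of equal length with "-" marking gaps, and identity is counted per residue of the first sequence.

```python
def least_identity_window(aligned_a, aligned_b, width=30):
    """Return (first_residue, last_residue, identity) for the stretch of
    `width` contiguous residues of sequence A that is least identical to B.

    Residue numbers are 1-based and refer to sequence A without gaps.
    Ties are resolved in favour of the window closest to the N-terminus.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # One True/False flag per residue of A; columns where A has a gap are skipped.
    matches = [a == b for a, b in zip(aligned_a, aligned_b) if a != "-"]
    if len(matches) < width:
        raise ValueError("sequence A is shorter than the window")
    count = sum(matches[:width])
    best_start, best_count = 0, count
    for start in range(1, len(matches) - width + 1):
        # Slide by one residue: add the entering flag, drop the leaving one.
        count += matches[start + width - 1] - matches[start - 1]
        if count < best_count:
            best_start, best_count = start, count
    return best_start + 1, best_start + width, best_count / width


if __name__ == "__main__":
    # Made-up toy alignment: identical flanks around a fully divergent core.
    a = "M" * 40 + "A" * 30 + "K" * 40
    b = "M" * 40 + "G" * 30 + "K" * 40
    print(least_identity_window(a, b))
```

With a multiple alignment, the same scan could be repeated against each homolog and the per-window identities averaged; that extension is left out to keep the sketch short.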

Another embodiment of the invention is an isolated polypeptide having an amino acid
sequence selected from: (a) an amino acid sequence shown as residues 1 to 287 of SEQ ID
NO:159; (b) at least 30 contiguous amino acid residues of SEQ ID NO:159; (c) an amino
acid sequence of a fragment of SEQ ID NO:159, the fragment having the activity of
farnesyl diphosphate synthase (FPP synthase); (d) an amino acid sequence of a polypeptide
encoded by a polynucleotide that hybridizes under stringent conditions to a hybridization
probe containing at least 30 consecutive nucleotides of the ispA gene (i.e., nucleotides
295-1158 of SEQ ID NO:157) or a complement thereof, wherein the polypeptide has the
activity of FPP synthase; and (e) conservatively modified variants of SEQ ID NO:159.

Thus, in this embodiment the amino acid sequence may be the one encoded by the entire open
reading frame that encodes FPP synthase, i.e., residues 1-287 of SEQ ID NO:159, at least 30
contiguous residues thereof, or a fragment of SEQ ID NO:159 that has FPP synthase activity as
measured by the assay set forth in Example 1. Furthermore, this embodiment of the
invention also includes amino acid sequence(s) encoded by polynucleotide(s) that hybridize
under stringent conditions, as defined above, to a hybridization probe that includes at least
30 consecutive nucleotides of the ispA gene (i.e., nucleotides 295-1158 of SEQ ID NO:157)
or a complement thereof, wherein the polypeptide has FPP synthase activity as defined
above.

In a preferred embodiment, the polypeptide has the amino acid sequence of SEQ ID 
NO:159. 

Another embodiment of the invention is an isolated polypeptide having an amino acid
sequence selected from the following group: (a) an amino acid sequence shown as residues
1 to 142 of SEQ ID NO:160; (b) at least 30 contiguous amino acid residues of SEQ ID
NO:160; (c) an amino acid sequence of a fragment of SEQ ID NO:160, the fragment
having the activity of 1-deoxyxylulose-5-phosphate synthase (DXPS); (d) an amino acid
sequence of a polypeptide encoded by a polynucleotide that hybridizes under stringent
conditions to a hybridization probe containing at least 30 consecutive nucleotides
spanning positions 1185-1610 of SEQ ID NO:157 or a complement thereof, wherein the
polypeptide has the activity of DXPS; and (e) conservatively modified variants of SEQ ID
NO:160.

Thus, in this embodiment the amino acid sequence may be the one encoded by the entire open
reading frame that encodes DXPS, i.e., residues 1-142 of SEQ ID NO:160, at least 30 contiguous
residues thereof, or a fragment of SEQ ID NO:160 that has DXPS activity
as measured by the assay set forth in Example 1. Furthermore, this embodiment of the
invention also includes amino acid sequence(s) encoded by polynucleotide(s) that
hybridize under stringent conditions, as defined above, to a hybridization probe that includes at
least 30 consecutive nucleotides of the DXPS gene (i.e., nucleotides 1185-1610 of SEQ ID
NO:157) or a complement thereof, wherein the polypeptide has DXPS activity as defined
above.

In a preferred embodiment, the polypeptide has the amino acid sequence of SEQ ID
NO:160.

Another embodiment of the invention is an isolated polypeptide having an amino acid
sequence selected from: (a) an amino acid sequence shown as residues 1 to 390 of SEQ ID
NO:178; (b) at least 30 contiguous amino acid residues of SEQ ID NO:178; (c) an amino
acid sequence of a fragment of SEQ ID NO:178, the fragment having the activity of
acetyl-CoA acetyltransferase; (d) an amino acid sequence of a polypeptide encoded by a
polynucleotide that hybridizes under stringent conditions to a hybridization probe containing
at least 30 consecutive nucleotides of the phaA gene (i.e., nucleotides 1-1179 of SEQ ID
NO:177) or a complement thereof, wherein the polypeptide has the activity of acetyl-CoA
acetyltransferase; and (e) conservatively modified variants of SEQ ID NO:178.

Thus, in this embodiment the amino acid sequence may be the one encoded by the entire open
reading frame that encodes acetyl-CoA acetyltransferase, i.e., residues 1-143 of SEQ ID NO:178, at
least 30 contiguous residues thereof, or a fragment of SEQ ID NO:178 that has acetyl-CoA
acetyltransferase activity as measured by the assay set forth in Example 1. Furthermore,
this embodiment of the invention also includes amino acid sequence(s) encoded by
polynucleotide(s) that hybridize under stringent conditions, as defined above, to a
hybridization probe that includes at least 30 consecutive nucleotides of the phaA gene (i.e.,
nucleotides 1-1170 of SEQ ID NO:177), or a complement thereof, wherein the polypeptide has
the acetyl-CoA acetyltransferase activity as defined above.

In a preferred embodiment, the polypeptide has the amino acid sequence of SEQ ID
NO:178.

Another embodiment of the invention is an isolated polypeptide having an amino acid
sequence selected from: (a) an amino acid sequence shown as residues 1 to 240 of SEQ ID
NO:179; (b) at least 30 contiguous amino acid residues of SEQ ID NO:179; (c) an amino
acid sequence of a fragment of a polypeptide of SEQ ID NO:179, the fragment having the
activity of acetoacetyl-CoA reductase; (d) an amino acid sequence of a polypeptide
encoded by a polynucleotide that hybridizes under stringent conditions to a hybridization
probe containing at least 30 consecutive nucleotides of the phaB gene (i.e., nucleotides
1258-1980 of SEQ ID NO:177) or a complement thereof, wherein the polypeptide has the
activity of acetoacetyl-CoA reductase; and (e) conservatively modified variants of SEQ ID
NO:179.

Thus, in this embodiment the amino acid sequence may be the one encoded by the entire open
reading frame that encodes acetoacetyl-CoA reductase, i.e., residues 1-240 of SEQ ID NO:179, at
least 30 contiguous residues thereof, or a fragment of SEQ ID NO:179 that has
acetoacetyl-CoA reductase activity as measured by the assay set forth in Example 1. Furthermore, this
embodiment of the invention also includes amino acid sequence(s) encoded by
polynucleotide(s) that hybridize under stringent conditions, as defined above, to a
hybridization probe that includes at least 30 consecutive nucleotides of the phaB gene (i.e.,
nucleotides 1258-1980 of SEQ ID NO:177) or a complement thereof, wherein the polypeptide has
acetoacetyl-CoA reductase activity as defined above.

In a preferred embodiment, the polypeptide has the amino acid sequence of SEQ ID 
NO: 179. 

The terms "polypeptide," "polypeptide sequence," "amino acid," and "amino acid
sequence" are used interchangeably herein, and mean an oligopeptide, peptide,
polypeptide, or protein sequence, or a fragment of any of these, as well as naturally occurring
or synthetic molecules. In this context, "fragments," "immunogenic fragments," or
"antigenic fragments" refer to fragments of any of the polypeptides defined herein which are at
least about 30 amino acids in length and which retain some biological activity or
immunological activity of the polypeptide in question. Where "amino acid sequence" is recited
herein to refer to an amino acid sequence of a naturally occurring protein molecule,
"amino acid sequence" and like terms are not meant to limit the amino acid sequence to
the complete native amino acid sequence associated with the recited protein molecule.

With respect to polypeptides, the term "isolated" means a protein or a polypeptide that has
been separated from components that accompany it in its natural state. A monomeric
protein is isolated when at least about 60 to 75% of a sample exhibits a single polypeptide
sequence. An isolated protein will typically comprise about 60 to 90% W/W of a protein
sample, more usually about 95%, and preferably will be over about 99% pure. Protein
purity or homogeneity may be indicated by a number of means well known in the art, such
as polyacrylamide gel electrophoresis of a protein sample, followed by visualizing a single
polypeptide band upon staining the gel. For certain purposes, using HPLC or other means
well known in the art may provide higher resolution for purification.

As used herein, the term "biologically active" refers to a protein having structural,
regulatory, or biochemical functions of a naturally occurring molecule. Likewise,
"immunologically active" refers to the capability of the natural, recombinant, or synthetic polypeptide,
or of any oligopeptide thereof, to induce a specific immune response in appropriate
animals or cells and to bind with specific antibodies.

Another embodiment of the invention is an isolated polynucleotide sequence having the
nucleotide sequence of the mevalonate operon (SEQ ID NO:42), variants of SEQ ID
NO:42 containing one or more substitutions according to the Paracoccus sp. strain R1534
codon usage table (see Table 14) or fragments of SEQ ID NO:42. The variants and
fragments of SEQ ID NO:42 must encode a polypeptide having an activity selected from:
HMG-CoA reductase, isopentenyl diphosphate isomerase, hydroxymethylglutaryl-CoA
synthase (HMG-CoA synthase), mevalonate kinase, phosphomevalonate kinase, and
diphosphomevalonate decarboxylase. This embodiment also includes polynucleotide
sequences that hybridize under stringent conditions, as defined above, to a hybridization
probe, the nucleotide sequence of which consists of from about 10 to about 9066
nucleotides of SEQ ID NO:42, preferably at least 30 contiguous nucleotides of SEQ ID NO:42, or
a complement of such sequences, which polynucleotide encodes a polypeptide having an
activity selected from: HMG-CoA reductase, isopentenyl diphosphate isomerase,
HMG-CoA synthase, mevalonate kinase, phosphomevalonate kinase, and diphosphomevalonate
decarboxylase.

This embodiment also includes isolated polynucleotide sequences spanning the following
residues of SEQ ID NO:42: 2622 to 3644, 3641 to 4690, 4687 to 5853, 5834 to 6970, 6970
to 7887, 7880 to 8878. Fragments of these sequences are also within the scope of the
invention, so long as they encode a polypeptide having HMG-CoA reductase activity,
isopentenyl diphosphate isomerase activity, HMG-CoA synthase activity, mevalonate
kinase activity, phosphomevalonate kinase activity, or diphosphomevalonate
decarboxylase activity, respectively.

This embodiment also includes polynucleotide sequences that hybridize under stringent
conditions, as defined above, to a hybridization probe selected from a nucleotide sequence
which consists of at least 30 contiguous nucleotides of the following residues of SEQ ID
NO:42: 2622 to 3644, 3641 to 4690, 4687 to 5853, 5834 to 6970, 6970 to 7887, 7880 to 8878
or a complement thereof, wherein the polynucleotide encodes a polypeptide having
HMG-CoA reductase activity, isopentenyl diphosphate isomerase activity, HMG-CoA synthase
activity, mevalonate kinase activity, phosphomevalonate kinase activity, or
diphosphomevalonate decarboxylase activity, respectively.

Preferably, the isolated polynucleotide consists of nucleotides 2622 to 3644, 3641 to 4690, 
4687 to 5853, 5834 to 6970, 6970 to 7887 or 7880 to 8878 of SEQ ID NO:42. 
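
The nucleotide spans recited above can be cross-checked against the polypeptide lengths given earlier for SEQ ID NOs:43, 45, 47, 49, 51 and 53. The following Python sketch is illustrative only: the names are hypothetical, and it assumes (this is not stated in the text) that each span is a complete open reading frame that includes its stop codon, so that the number of residues equals the number of codons minus one.

```python
# (enzyme, first nt, last nt in SEQ ID NO:42, polypeptide SEQ ID NO, stated residues)
MEVALONATE_OPERON_ORFS = [
    ("HMG-CoA reductase", 2622, 3644, 43, 340),
    ("IPP isomerase", 3641, 4690, 45, 349),
    ("HMG-CoA synthase", 4687, 5853, 47, 388),
    ("mevalonate kinase", 5834, 6970, 49, 378),
    ("phosphomevalonate kinase", 6970, 7887, 51, 305),
    ("diphosphomevalonate decarboxylase", 7880, 8878, 53, 332),
]


def orf_summary(start, end):
    """Return (length in nt, whole codons, leftover nt) for an inclusive span."""
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


def overlaps(orfs):
    """Number of nucleotides shared by each pair of consecutive spans."""
    return [prev[2] - nxt[1] + 1 for prev, nxt in zip(orfs, orfs[1:])]


if __name__ == "__main__":
    for name, start, end, seq_id, residues in MEVALONATE_OPERON_ORFS:
        length, codons, remainder = orf_summary(start, end)
        # Assumption: the last codon of each span is a stop codon.
        consistent = remainder == 0 and codons - 1 == residues
        print(f"SEQ ID NO:{seq_id} {name}: {length} nt, "
              f"{codons} codons, {residues} aa, consistent={consistent}")
    print("overlaps between consecutive genes (nt):",
          overlaps(MEVALONATE_OPERON_ORFS))
```

Under that assumption all six spans agree with the stated residue counts, and consecutive genes overlap by a few nucleotides, which is common for genes organized in a bacterial operon.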

Another embodiment of the invention is an isolated polynucleotide sequence having the
nucleotide sequence of SEQ ID NO:157, variants of SEQ ID NO:157 containing one or
more substitutions according to the Paracoccus sp. strain R1534 codon usage table (see Table
14) or fragments of SEQ ID NO:157 that encode a polypeptide having FPP synthase
activity, 1-deoxy-D-xylulose 5-phosphate synthase activity or the activity of XseB. This
embodiment also includes polynucleotide sequences that hybridize under stringent
conditions, as defined above, to a hybridization probe the nucleotide sequence of which consists
of at least 30 contiguous nucleotides of SEQ ID NO:157, or the complement of SEQ ID
NO:157, wherein the polynucleotide encodes a polypeptide having FPP synthase activity,
1-deoxy-D-xylulose 5-phosphate synthase activity or the activity of XseB.

Preferably, the isolated polynucleotide consists of nucleotides 59-292, 295-1158 or
1185-1610 of SEQ ID NO:157.

An isolated polynucleotide sequence is also provided that has a nucleotide sequence
selected from the following group: nucleotides spanning positions 59-292 of SEQ ID
NO:157, variants of the nucleotide sequence spanning positions 59-292 of SEQ ID NO:157
containing one or more substitutions according to the Paracoccus sp. strain R1534 codon
usage table (Table 14), fragments of the nucleotide sequence spanning positions 59-292 of
SEQ ID NO:157 that encode a polypeptide having a function of XseB, and polynucleotide
sequences that hybridize under stringent conditions to a hybridization probe the
nucleotide sequence of which consists of at least 30 contiguous nucleotides spanning positions
59-292 of SEQ ID NO:157, or the complement of such a sequence, wherein the
polynucleotide encodes a polypeptide having a function of XseB.

Preferably, the isolated polynucleotide consists of nucleotides 59 to 292 of SEQ ID
NO:157.

An isolated polynucleotide sequence is also provided that has a nucleotide sequence
selected from the following group: nucleotides spanning positions 295-1158 of SEQ ID
NO:157, variants of the nucleotide sequence spanning positions 295-1158 of SEQ ID
NO:157 containing one or more substitutions according to the Paracoccus sp. strain R1534
codon usage table (Table 14), fragments of the nucleotide sequence spanning positions
295-1158 of SEQ ID NO:157 that encode a polypeptide having FPP synthase activity, and
polynucleotide sequences that hybridize under stringent conditions to a hybridization probe the
nucleotide sequence of which consists of at least 30 contiguous nucleotides spanning positions
295-1158 of SEQ ID NO:157, or the complement of such a sequence, wherein the
polynucleotide encodes a polypeptide having FPP synthase activity.

Preferably, the isolated nucleotide sequence consists of nucleotides 295-1158 of SEQ ID
NO:157.

Another embodiment of the invention is an isolated polynucleotide sequence having the
nucleotide sequence spanning positions 1185-1610 of SEQ ID NO:157, variants of the
nucleotide sequence spanning positions 1185-1610 of SEQ ID NO:157 containing one or
more substitutions according to the Paracoccus sp. strain R1534 codon usage table (see
Table 14) or fragments of the nucleotide sequence spanning positions 1185-1610 of SEQ ID
NO:157 that encode a polypeptide having 1-deoxyxylulose-5-phosphate synthase activity.
This embodiment also includes polynucleotide sequences that hybridize under stringent
conditions, as defined above, to a hybridization probe the nucleotide sequence of which
consists of at least 30 contiguous nucleotides spanning positions 1185-1610 of SEQ ID
NO:157, or a complement thereof, wherein the polynucleotide encodes a polypeptide
having 1-deoxyxylulose-5-phosphate synthase activity.

Preferably, the isolated polynucleotide consists of nucleotides 1185 to 1610 of SEQ ID
NO:157.
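
The coordinate ranges recited for SEQ ID NO:157 read as 1-based, inclusive positions, as is conventional in sequence listings. The following is a minimal Python sketch, under that assumption, of how such sub-regions and their complements could be taken from a longer sequence. The names `REGIONS_157`, `subregion` and `reverse_complement` are hypothetical, and the placeholder string is not the actual SEQ ID NO:157 sequence.

```python
# Hypothetical helper names; coordinates are assumed 1-based and inclusive.
REGIONS_157 = {
    "XseB": (59, 292),
    "FPP synthase": (295, 1158),
    "1-deoxy-D-xylulose 5-phosphate synthase": (1185, 1610),
}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subregion(seq, start, end):
    """Return residues start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    placeholder = "ACGT" * 403  # 1612 nt stand-in, NOT the real sequence
    for name, (start, end) in REGIONS_157.items():
        fragment = subregion(placeholder, start, end)
        print(name, len(fragment), "nt")
```

Each of the three recited ranges has a length that is a multiple of three, which is consistent with their being read as coding regions.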

Another embodiment of the invention is an isolated polynucleotide sequence having the
nucleotide sequence of SEQ ID NO:177, variants of SEQ ID NO:177 containing one or
more substitutions according to the Paracoccus sp. strain R1534 codon usage table (see
Table 14) or fragments of SEQ ID NO:177 that encode a polypeptide having an activity selected
from acetyl-CoA acetyltransferase and acetoacetyl-CoA reductase. This embodiment also
includes polynucleotide sequences that hybridize under stringent conditions, as defined
above, to a hybridization probe the nucleotide sequence of which consists of at least 30
contiguous nucleotides of SEQ ID NO:177, or a complement thereof, which polynucleotide
encodes a polypeptide having an activity selected from the group consisting of
acetyl-CoA acetyltransferase and acetoacetyl-CoA reductase.

In this embodiment the isolated polynucleotide sequence may include nucleotides 1 to
1170 of SEQ ID NO:177, variants of SEQ ID NO:177 containing one or more substitutions
according to the Paracoccus sp. strain R1534 codon usage table (see Table 14) or fragments
of SEQ ID NO:177 that encode a polypeptide having acetyl-CoA acetyltransferase activity.
This embodiment also includes polynucleotide sequences that hybridize under stringent
conditions to a hybridization probe the nucleotide sequence of which consists of at least 30
contiguous nucleotides of nucleotides 1 to 1170 of SEQ ID NO:177, or a complement
thereof, wherein the polynucleotide encodes a polypeptide having acetyl-CoA
acetyltransferase activity.

Preferably, the isolated polynucleotide sequence consists of nucleotides 1-1170 of SEQ ID
NO:177. 

In this embodiment, the isolated polynucleotide sequence may alternatively be nucleotides
1258-1980 of SEQ ID NO:177, variants of SEQ ID NO:177 containing one or more
substitutions according to the Paracoccus sp. strain R1534 codon usage table (see Table 14) or
fragments of SEQ ID NO:177 that encode a polypeptide having acetoacetyl-CoA reductase
activity. This embodiment also includes polynucleotide sequences that hybridize under
stringent conditions to a hybridization probe the nucleotide sequence of which consists of
at least 30 contiguous nucleotides of nucleotides 1258-1980 of SEQ ID NO:177, or a
complement thereof, wherein the polynucleotide encodes a polypeptide having
acetoacetyl-CoA reductase activity.

Preferably, the isolated polynucleotide consists of nucleotides 1258-1980 of SEQ ID 
NO:177. 

In another embodiment of the invention, the isolated polynucleotide sequence has a
nucleotide sequence selected from SEQ ID NO:42, SEQ ID NO:157, SEQ ID NO:177, and
combinations thereof. As used herein, the phrase "and combinations thereof" when used
in reference to nucleotide sequences means that any combination of the recited sequences
may be combined to form the isolated polynucleotide sequence. Moreover, in the present
invention, multiple copies of the same sequence, i.e., concatamers, may be used. Likewise,
and as set forth in more detail below, multiple copies of plasmids containing the same
polynucleotide sequence may be transferred into suitable host cells.

As used herein, an "isolated" polynucleotide (e.g., an RNA, DNA or a mixed polymer) is
one which is substantially separated from other cellular components which naturally
accompany a native sequence or polypeptide, e.g., ribosomes, polymerases, many other
genome sequences and proteins. The term embraces a polynucleotide that has been
removed from its naturally occurring environment, and includes recombinant or cloned
DNA isolates and chemically synthesized analogs or analogs biologically synthesized by
heterologous systems.

The phrase "nucleic acid sequence" refers to a single or double-stranded polymer of
deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. It includes
chromosomal DNA, self-replicating plasmids, infectious polymers of DNA or RNA and
DNA or RNA that performs a primarily structural role.

An "expression control sequence" is defined as an array of nucleic acid control sequences 
that direct transcription of an operably linked nucleic acid. An example of such an
expression control sequence is a "promoter." Promoters include necessary nucleic acid
sequences near the start site of transcription. A promoter also optionally includes distal
enhancer or repressor elements, which can be located as much as several thousand base
pairs from the start site of transcription. A "constitutive" promoter is a promoter that is
active under most environmental and developmental conditions. An "inducible" promoter
is a promoter that is active under environmental or developmental regulation. The term
"operably linked" refers to a functional linkage between a nucleic acid expression control
sequence (such as a promoter or array of transcription factor binding sites) and a second
nucleic acid sequence, wherein the expression control sequence directs transcription of the
nucleic acid corresponding to the second sequence.

A polynucleotide sequence is "heterologous to" an organism or a second polynucleotide
sequence if it originates from a foreign species, or, if from the same species, is modified
from its original form. For example, a promoter operably linked to a heterologous coding
sequence refers to a coding sequence from a species different from that from which the
promoter was derived, or, if from the same species, a coding sequence which is different
from any naturally occurring allelic variants.

In the case of both expression of transgenes and inhibition of endogenous genes (e.g., by 
antisense, or sense suppression) one of skill will recognize that the inserted polynucleotide 
sequence need not be identical, but may be only "substantially identical" to a sequence of 
the gene from which it was derived.

In the case where the inserted polynucleotide sequence is transcribed and translated to
produce a functional polypeptide, one of skill will recognize that because of codon
degeneracy a number of polynucleotide sequences will encode the same polypeptide. These
variants are specifically within the scope of the present invention. In addition, the present
invention specifically includes those sequences that are substantially identical (determined as
described below) to each other and that encode polypeptides that are either mutants of
wild type polypeptides or retain the function of the polypeptide (e.g., resulting from
conservative substitutions of amino acids in the polypeptide). In addition, variants can be
those that encode dominant negative mutants as described below.

Two nucleic acid sequences or polypeptides are said to be "identical" if the sequence of
nucleotides or amino acid residues, respectively, in the two sequences is the same when
aligned for maximum correspondence as described below. The terms "identical" or
percent "identity," in the context of two or more nucleic acids or polypeptide sequences,
refer to two or more sequences or subsequences that are the same or have a specified
percentage of amino acid residues or nucleotides that are the same, when compared and
aligned for maximum correspondence over a comparison window, as measured using one
of the following sequence comparison algorithms or by manual alignment and visual
inspection. When percentage of sequence identity is used in reference to proteins or
peptides, it is recognized that residue positions that are not identical often differ by
conservative amino acid substitutions, where amino acid residues are substituted for
other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity)
and therefore do not change the functional properties of the molecule. Where sequences
differ in conservative substitutions, the percent sequence identity may be adjusted
upwards to correct for the conservative nature of the substitution. Means for making this
adjustment are well known to those of skill in the art. Typically this involves scoring a
conservative substitution as a partial rather than a full mismatch, thereby increasing the
percentage sequence identity. Thus, for example, where an identical amino acid is given a
score of 1 and a non-conservative substitution is given a score of zero, a conservative
substitution is given a score between zero and 1. The scoring of conservative substitutions
is calculated according to, e.g., the algorithm of Meyers & Miller, Computer Applic. Biol.
Sci. 4:11-17 (1988), e.g., as implemented in the program PC/GENE (Intelligenetics,
Mountain View, Calif., USA).
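
The partial-credit scoring described in this paragraph can be illustrated with a short Python sketch. It is not the Meyers & Miller algorithm or the PC/GENE implementation; it simply assumes two pre-aligned, gap-free sequences of equal length, uses the six conservative-substitution groups listed later in this specification, and takes 0.5 as an arbitrary illustrative score for a conservative substitution. The names `GROUPS` and `adjusted_identity` are hypothetical.

```python
# Conservative-substitution groups as listed later in this specification.
GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def adjusted_identity(a, b, conservative_score=0.5):
    """Percent identity of two pre-aligned, equal-length sequences.

    Identical residues score 1, conservative substitutions score
    `conservative_score` (an assumed value between 0 and 1), and all
    other substitutions score 0.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    for x, y in zip(a, b):
        if x == y:
            total += 1.0
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            total += conservative_score
    return 100.0 * total / len(a)


if __name__ == "__main__":
    print(adjusted_identity("ADKL", "AEKL"))       # D/E is conservative
    print(adjusted_identity("ADKL", "AEKL", 0.0))  # unadjusted identity
```

Setting `conservative_score` to zero recovers the unadjusted percent identity, so the upward adjustment is the difference between the two printed values.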

The phrase "substantially identical," in the context of two nucleic acids or polypeptides,
refers to sequences or subsequences that have at least 60%, preferably 80%, most
preferably 90-95%, nucleotide or amino acid residue identity when aligned for maximum
correspondence over a comparison window as measured using one of the following
sequence comparison algorithms or by manual alignment and visual inspection. This
definition also refers to a sequence of which the complement of that sequence hybridizes
to the test sequence.

For sequence comparison, typically one sequence acts as a reference sequence, to which
test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Default program
parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window," as used herein, includes reference to a segment of any one of the
number of contiguous positions selected from the group consisting of from 20 to 600,
usually about 50 to about 200, more usually about 100 to about 150, in which a sequence
may be compared to a reference sequence of the same number of contiguous positions
after the two sequences are optimally aligned. Methods of alignment of sequences for
comparison are well known in the art. Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith and Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J.
Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc.
Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software
Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual
alignment and visual inspection.
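
As an illustration of the local homology approach cited above, the following Python sketch computes a Smith-Waterman local alignment score by dynamic programming. It is a simplified teaching version and not the GAP or BESTFIT implementation: the linear gap penalty and the match, mismatch and gap values are assumptions chosen for the example, and the function name `smith_waterman_score` is hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty.

    Only two rows of the dynamic-programming matrix are kept in memory.
    The scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The floor of zero in each cell is what distinguishes this local method from the global Needleman-Wunsch alignment, in which negative running scores are carried forward.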

One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence
alignment from a group of related sequences using progressive, pairwise alignments to show
relationship and percent sequence identity. It also plots a tree or dendrogram showing the
clustering relationships used to create the alignment. PILEUP uses a simplification of the
progressive alignment method of Feng and Doolittle, J. Mol. Evol. 35:351-360 (1987). The
method used is similar to the method described by Higgins and Sharp, CABIOS 5:151-153
(1989). The program can align up to 300 sequences, each of a maximum length of 5,000
nucleotides or amino acids. The multiple alignment procedure begins with the pairwise
alignment of the two most similar sequences, producing a cluster of two aligned sequences.
This cluster is then aligned to the next most related sequence or cluster of aligned
sequences. Two clusters of sequences are aligned by a simple extension of the pairwise
alignment of two individual sequences. The final alignment is achieved by a series of
progressive, pairwise alignments. The program is run by designating specific sequences and
their amino acid or nucleotide coordinates for regions of sequence comparison and by
designating the program parameters. For example, a reference sequence can be compared
to other test sequences to determine the percent sequence identity relationship using the
following parameters: default gap weight (3.00), default gap length weight (0.10), and
weighted end gaps.
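
The order in which a progressive method adds sequences can be sketched in a few lines of Python. This is not PILEUP and does not perform any alignment: it assumes the input sequences are already of equal length, measures similarity as simple positional identity, starts from the most similar pair, and then repeatedly adds the sequence with the highest average identity to those already joined. The names `identity` and `join_order` are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions in two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def join_order(seqs):
    """Return sequence names in the order a greedy progressive scheme joins them.

    `seqs` maps a name to a sequence; all sequences are assumed to have
    the same length (a simplification made for this sketch).
    """
    names = list(seqs)
    if len(names) < 2:
        return names
    # Start with the two most similar sequences.
    first = max(combinations(names, 2),
                key=lambda p: identity(seqs[p[0]], seqs[p[1]]))
    order = list(first)
    remaining = [n for n in names if n not in order]
    while remaining:
        # Add the sequence most related, on average, to the current cluster.
        nxt = max(remaining,
                  key=lambda n: sum(identity(seqs[n], seqs[m]) for m in order) / len(order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    demo = {"d": "TTTTTTTT", "c": "ACGAAAGA", "a": "ACGTACGT", "b": "ACGTACGA"}
    print(join_order(demo))
```

A full implementation would build a tree so that clusters can also be merged with other clusters, as the paragraph above describes; the greedy list here only conveys the "most similar first" principle.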

Another example of an algorithm that is suitable for determining percent sequence identity
and sequence similarity is the BLAST algorithm [Altschul et al., J. Mol. Biol. 215:403-410
(1990)]. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a word of the same length in a database
sequence. T is referred to as the neighborhood word score threshold (Altschul et al.,
supra). These initial neighborhood word hits act as seeds for initiating searches to find
longer HSPs containing them. The word hits are extended in both directions along each
sequence for as far as the cumulative alignment score can be increased. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLAST program uses as defaults a wordlength
(W) of 11, the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad.
Sci. USA 89:10915 (1989)), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a
comparison of both strands.
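
The seed-and-extend procedure described above can be sketched in Python as follows. This is a simplified illustration and not the NCBI BLAST implementation: it seeds only on exact word matches (the neighborhood threshold T is not modeled), extends without gaps, and uses the match reward M=5 and mismatch penalty N=-4 mentioned above with an assumed drop-off X of 20. The function names `word_hits` and `extend_hit` are hypothetical.

```python
def word_hits(query, subject, w):
    """Yield (i, j) such that query[i:i+w] == subject[j:j+w]."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            yield i, j


def extend_hit(query, subject, i, j, w, x_drop=20, m=5, n=-4):
    """Ungapped extension of an exact word hit in both directions.

    Returns (score, query_start, query_end) with a half-open query range.
    Extension in a direction stops when the running score falls x_drop
    below its maximum, reaches zero or below, or a sequence end is reached.
    """
    seed = w * m

    def walk(qi, sj, step):
        score = best = seed
        best_len = steps = 0
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            score += m if query[qi] == subject[sj] else n
            steps += 1
            if score > best:
                best, best_len = score, steps
            if best - score >= x_drop or score <= 0:
                break
            qi += step
            sj += step
        return best - seed, best_len

    right_gain, right_len = walk(i + w, j + w, 1)
    left_gain, left_len = walk(i - 1, j - 1, -1)
    return seed + left_gain + right_gain, i - left_len, i + w + right_len


if __name__ == "__main__":
    q, s = "TTTACGTACGTAAA", "GGGACGTACGTCCC"
    for i, j in word_hits(q, s, 4):
        print((i, j), extend_hit(q, s, i, j, 4))
```

A larger X lets an extension run through longer stretches of mismatches before giving up, which is one way the parameters trade sensitivity against speed.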

The BLAST algorithm also performs a statistical analysis of the similarity between two
sequences [see, e.g., Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5787 (1993)].
One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between two
nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is
considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001.

"Conservatively modified variants" applies to both amino acid and nucleic acid sequences. 
With respect to particular nucleic acid sequences, conservatively modified variants refers 
to those nucleic acids which encode identical or essentially identical amino acid sequences,
or where the nucleic acid does not encode an amino acid sequence, to essentially identical
sequences. Because of the degeneracy of the genetic code, a large number of functionally
identical nucleic acid codons encode any given protein. For instance, the codons GCA,
GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an
alanine is specified by a codon, the codon can be altered to any of the corresponding
codons described without altering the encoded polypeptide. Such nucleic acid variations
are "silent variations," which are one species of conservatively modified variations. Every
nucleic acid sequence herein that encodes a polypeptide also describes every possible silent
variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid
(except AUG, which is ordinarily the only codon for methionine) can be modified to yield
a functionally identical molecule. Accordingly, each silent variation of a nucleic acid that
encodes a polypeptide is implicit in each described sequence.
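
The notion of a silent variation can be made concrete with a small Python sketch based on the standard genetic code. The helper names `translate`, `synonymous_codons` and `is_silent_variant` are hypothetical, and the sketch uses the standard code only; it does not use the Paracoccus codon usage table referred to elsewhere in this specification. DNA letters are used internally, with U accepted as a synonym for T.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    dna = dna.upper().replace("U", "T")
    if len(dna) % 3:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(codon):
    """Return, sorted, every codon encoding the same amino acid as `codon`."""
    aa = CODON_TABLE[codon.upper().replace("U", "T")]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def is_silent_variant(ref, var):
    """True if var has the same length as ref and encodes the same polypeptide."""
    return len(ref) == len(var) and translate(ref) == translate(var)


if __name__ == "__main__":
    print(synonymous_codons("GCU"))
    print(is_silent_variant("ATGGCA", "ATGGCC"))
```

In the standard code the tryptophan codon (UGG) is, like AUG, without a synonym, so a helper of this kind returns a single codon for it as well.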

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, or substitutions to a peptide, polypeptide, or
protein sequence, that alter a single amino acid or a small percentage of amino acids (i.e.,
less than 20%, such as 15%, 10%, 5%, 4%, 3%, 2% or 1%) in the encoded sequence are
"conservatively modified variants" where the alteration results in the substitution of an
amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art.

The following six groups each contain amino acids that are conservative substitutions for
one another:

Alanine (A), Serine (S), Threonine (T);
Aspartic acid (D), Glutamic acid (E);
Asparagine (N), Glutamine (Q);
Arginine (R), Lysine (K);
Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
Phenylalanine (F), Tyrosine (Y), Tryptophan (W)

(see, e.g., Creighton, Proteins (1984)).
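
The six groups above, together with the "less than 20%" guideline, lend themselves to a simple check. The Python sketch below is an illustration under stated assumptions: it compares two polypeptides of equal length position by position, so deletions and additions are not handled, and the function names `is_conservative` and `is_conservatively_modified` are hypothetical.

```python
# The six conservative-substitution groups listed above.
GROUPS = [set("AST"), set("DE"), set("NQ"), set("RK"), set("ILMV"), set("FYW")]


def is_conservative(x, y):
    """True if x and y are different residues from the same group."""
    return x != y and any(x in g and y in g for g in GROUPS)


def is_conservatively_modified(ref, var, max_fraction=0.20):
    """Check an equal-length variant against a reference polypeptide.

    Returns True when fewer than `max_fraction` of positions differ and
    every difference is a conservative substitution.
    """
    if len(ref) != len(var) or not ref:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = [(x, y) for x, y in zip(ref, var) if x != y]
    if len(diffs) / len(ref) >= max_fraction:
        return False
    return all(is_conservative(x, y) for x, y in diffs)


if __name__ == "__main__":
    print(is_conservatively_modified("MKLAVDEFRT", "MKIAVDEFRT"))  # L to I
    print(is_conservatively_modified("MKLAVDEFRT", "MKGAVDEFRT"))  # L to G
```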

An indication that two nucleic acid sequences or polypeptides are substantially identical is
that the polypeptide encoded by the first nucleic acid is immunologically cross reactive
with the antibodies raised against the polypeptide encoded by the second nucleic acid.
Thus, a polypeptide is typically substantially identical to a second polypeptide, for
example, where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two
molecules or their complements hybridize to each other under stringent conditions, as
described below.

The phrase "specifically hybridizes to" refers to the binding, duplexing, or hybridizing of a 
molecule only to a particular nucleotide sequence under stringent hybridization 
conditions when that sequence is present in a complex mixture (e.g., total cellular or
library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a probe
will hybridize to its target sequence, typically in a complex mixture of nucleic acid
sequences, but to no other sequences. Stringent conditions are sequence-dependent and will be
different in different circumstances. Longer sequences hybridize specifically at higher
temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen,
Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Probes,
"Overview of Principles of Hybridization and the Strategy of Nucleic Acid Assays" (1993).
Generally, highly stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
Low stringency conditions are generally selected to be about 15-30°C below the Tm. The
Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration)
at which 50% of the probes complementary to the target hybridize to the target sequence
at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0M sodium ion, typically about 0.01 to 1.0M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C
for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g.,
greater than 50 nucleotides). Stringent conditions may also be achieved with the addition
of destabilizing agents such as formamide. For selective or specific hybridization, a positive
signal is at least two times background, preferably 10 times background hybridization.
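
To illustrate how the temperature windows described above relate to the Tm, the following Python sketch estimates a Tm and derives the corresponding ranges. The Tm formula is an assumption brought in for the example: it is a widely used empirical approximation for long DNA:DNA hybrids (81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(% formamide) - 500/length) and is not taken from this specification. The function names are hypothetical.

```python
import math


def melting_temp_c(gc_percent, length_nt, sodium_molar, formamide_percent=0.0):
    """Approximate Tm in degrees C for a long DNA:DNA hybrid.

    Empirical approximation (an assumption of this sketch, not a formula
    from the specification); intended for probes well over 50 nucleotides.
    """
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


def stringency_windows(tm):
    """Temperature ranges (low, high) in degrees C relative to the Tm.

    High stringency: about 5-10 degrees below Tm.
    Low stringency: about 15-30 degrees below Tm.
    """
    return {"high": (tm - 10.0, tm - 5.0), "low": (tm - 30.0, tm - 15.0)}


if __name__ == "__main__":
    tm = melting_temp_c(gc_percent=50, length_nt=500, sodium_molar=1.0)
    print(round(tm, 1), stringency_windows(tm))
```

Lowering the sodium concentration or adding formamide lowers the estimated Tm, which is why a given wash temperature becomes more stringent as salt is reduced.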

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides that they encode are substantially identical. This
occurs, for example, when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically
hybridize under moderately stringent hybridization conditions.

In the present invention, genomic DNA or cDNA containing nucleic acids of the invention
can be identified in standard Southern blots under stringent conditions using the nucleic
acid sequences disclosed here. For the purposes of this disclosure, suitable stringent
conditions for such hybridizations are those which include hybridization in a buffer of 40%
formamide, 1M NaCl, 1% sodium dodecyl sulfate (SDS) at 37°C, and at least one wash in
0.2X SSC at a temperature of at least about 50°C, usually about 55°C to about 60°C, for 20
minutes, or equivalent conditions. A positive hybridization is at least twice background.
Those of ordinary skill will readily recognize that alternative hybridization and wash
conditions can be utilized to provide conditions of similar stringency.

A further indication that two polynucleotides are substantially identical is if the reference
sequence, amplified by a pair of oligonucleotide primers, can then be used as a probe
under stringent hybridization conditions to isolate the test sequence from a cDNA or
genomic library, or to identify the test sequence in, e.g., a northern or Southern blot.

The present invention also includes expression vectors as defined above. The expression
vectors include one or more copies of each of the polynucleotide sequences set forth above.
The expression vectors of the present invention may contain any of the polynucleotide
sequences defined herein, such as for example SEQ ID NO:42, or the following residues of
SEQ ID NO:42: 2622 to 3644, 3641 to 4690, 4687 to 5853, 5834 to 6970, 6970 to 7887,
7880 to 8878, as well as residues 59-292, 295-1158 or 1185-1610 of SEQ ID NO:157 and
residues 1-1170 or 1258-1980 of SEQ ID NO:177. The expression vectors may contain
combinations of the polynucleotide sequences identified herein, such as for example, SEQ
ID NO:42, SEQ ID NO:157, and SEQ ID NO:177.
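
The residue ranges recited for SEQ ID NO:42 overlap slightly at their boundaries. The Python sketch below, which assumes 1-based inclusive coordinates and uses hypothetical names (`REGIONS_42`, `region_length`, `overlap`), computes the length of each region and the number of positions shared by consecutive regions.

```python
# Residue ranges of SEQ ID NO:42 as recited above (assumed 1-based, inclusive).
REGIONS_42 = [
    (2622, 3644), (3641, 4690), (4687, 5853),
    (5834, 6970), (6970, 7887), (7880, 8878),
]


def region_length(region):
    """Number of residues in an inclusive (start, end) range."""
    start, end = region
    return end - start + 1


def overlap(a, b):
    """Number of positions shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for region in REGIONS_42:
        print(region, region_length(region), "nt")
    for a, b in zip(REGIONS_42, REGIONS_42[1:]):
        print(a, b, "share", overlap(a, b), "nt")
```

Every recited range has a length divisible by three, and each consecutive pair shares between 1 and 20 positions, which is consistent with closely packed, partly overlapping coding regions.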

The polynucleotide sequences in the expression vectors may optionally be operably linked 
to an expression control sequence as defined above and exemplified in the Examples.

The present invention also includes, for example, the following expression vectors:
pBBR-K-mev-op16-1, pBBR-K-mev-op16-2, pDS-mvaA, pDS-idi, pDS-hcs, pDS-mvk, pDS-pmk,
pDS-mvd, pDS-His-mvaA, pDS-His-idi, pDS-His-hcs, pDS-His-mvk, pDS-His-pmk,
pDS-His-mvd, pBBR-K-Zea4, pBBR-K-Zea4-up, pBBR-K-Zea4-down, pBBR-K-PcrtE-crtE-3,
pBBR-tK-PcrtE-mvaA, pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk,
pBBR-tK-PcrtE-pmk, pBBR-tK-PcrtE-mvd, pBBR-K-PcrtE-mvaA-crtE-3, pDS-His-phaA,
pBBR-K-PcrtE-crtW, pBBR-K-PcrtE-crtWZ, pBBR-K-PcrtE-crtZW, and
combinations thereof. These expression vectors are defined in more detail in the examples
below. Moreover, the present invention also includes any expression vector that contains
one of the sequences defined herein, which expression vector is used to express an
isoprenoid compound, such as a carotenoid, preferably zeaxanthin, in a suitable host cell.

As used herein, the phrase "expression vector" is a replicatable vehicle that carries, and is
capable of mediating the expression of, a DNA sequence encoding the polynucleotide
sequences set forth herein.

In the present context, the term "replicatable" means that the vector is able to replicate in a
given type of host cell into which it has been introduced. Immediately upstream of the
polynucleotide sequence(s) of interest, there may be provided a sequence coding for a
signal peptide, the presence of which ensures secretion of the encoded polypeptide
expressed by host cells harboring the vector. The signal sequence may be the one naturally
associated with the selected polynucleotide sequence or of another origin.

The vector may be any vector that may conveniently be subjected to recombinant DNA
procedures, and the choice of vector will often depend on the host cell into which it is to
be introduced. Thus, the vector may be an autonomously replicating vector, i.e. a vector
that exists as an extrachromosomal entity, the replication of which is independent of
chromosomal replication; examples of such a vector are a plasmid, phage, cosmid or
minichromosome. Alternatively, the vector may be one which, when introduced in a host cell,
is integrated in the host cell genome and is replicated together with the chromosome(s)
into which it has been integrated. Examples of suitable vectors are shown in the examples.
The expression vector of the invention may carry any of the DNA sequences of the
invention as defined below and be used for the expression of any of the polypeptides of the
invention defined below.

The present invention also includes cultured cells containing one or more of the
polynucleotide sequences and/or one or more of the expression vectors disclosed herein. As
used herein, a "cultured cell" includes any cell capable of growing under defined
conditions and expressing one or more polypeptides encoded by a polynucleotide of the
present invention. Preferably, the cultured cell is a yeast, fungus, bacterium, or alga. More
preferably, the cultured cell is a Paracoccus, Flavobacterium, Agrobacterium, Alcaligenes,
Erwinia, E. coli or B. subtilis. Even more preferably, the cell is a Paracoccus, such as for
example, R-1506, R-1512, R1534 or R114. The present invention also includes the progeny
of any of the cells identified herein that express a polypeptide disclosed herein. In the
present invention, a cell is a progeny of another cell if its AFLP DNA fingerprint is
indistinguishable using the conditions set forth in Example 2 from the fingerprint of the
putative parental cell.

Thus, the cultured cells according to the present invention may contain, for example, SEQ
ID NO:42, or the following residues of SEQ ID NO:42: 2622 to 3644, 3641 to 4690, 4687
to 5853, 5834 to 6970, 6970 to 7887, 7880 to 8878, as well as residues 59-292, 295-1158 or
1185-1610 of SEQ ID NO:157 and residues 1-1170 or 1258-1980 of SEQ ID NO:177.
These sequences may be transferred to the cells alone or as part of an expression vector.
These sequences also may optionally be operatively linked to expression control
sequence(s). The cultured cells may also contain combinations of the polynucleotide
sequences identified herein, such as for example, SEQ ID NO:42, SEQ ID NO:157, and
SEQ ID NO:177.

The cultured cells according to the present invention may further contain polynucleotides
that encode one or more enzymes in the carotenoid biosynthetic pathway. For example,
the cultured cells according to the present invention may contain one or more copies of
SEQ ID NOs:180, 182, and 184 alone or in combination with any of the polynucleotide
sequences identified herein. Thus, the polynucleotide sequences disclosed herein may be
transferred into a cultured cell alone or in combination with another polynucleotide
sequence that would provide enhanced production of the target isoprenoid compound,
such as, for example, carotenoids like zeaxanthin or astaxanthin. In this regard, the
present invention includes the use of any polynucleotide encoding, for example, a polypeptide
involved in carotenoid biosynthesis, such as GGPP synthase, β-carotene-β4-oxygenase
(ketolase), and/or β-carotene hydroxylase. In addition, combinations of polynucleotides
encoding polypeptides involved in carotenoid biosynthesis may be used in combination
with one or more of the polynucleotides identified herein on the same or different
expression vectors. Such constructs may be transferred to a cultured cell according to the
present invention to provide a cell that expresses an isoprenoid of interest.

For example, a cultured cell according to the present invention may contain one or more
of the following expression vectors: pBBR-K-mev-op16-1, pBBR-K-mev-op16-2,
pDS-mvaA, pDS-idi, pDS-hcs, pDS-mvk, pDS-pmk, pDS-mvd, pDS-His-mvaA, pDS-His-idi,
pDS-His-hcs, pDS-His-mvk, pDS-His-pmk, pDS-His-mvd, pBBR-K-Zea4,
pBBR-K-Zea4-up, pBBR-K-Zea4-down, pBBR-K-PcrtE-crtE-3, pBBR-tK-PcrtE-mvaA,
pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk, pBBR-tK-PcrtE-pmk,
pBBR-tK-PcrtE-mvd, pBBR-K-PcrtE-mvaA-crtE-3, pDS-His-phaA, pBBR-K-PcrtE-crtW,
pBBR-K-PcrtE-crtWZ, pBBR-K-PcrtE-crtZW, and combinations thereof.

Another embodiment of the invention is a method of producing a carotenoid. In this
method, a cultured cell as defined above is cultured under conditions permitting
expression of a polypeptide encoded by the polynucleotide sequence as defined above.
Culture conditions that permit expression of a polypeptide are provided in the Examples
below, but may be modified, if required, to suit the particular intended use. The
carotenoid is then isolated from the cell or, if secreted, from the medium of the cell.

In the present invention, a "carotenoid" includes the following compounds: phytoene,
lycopene, β-carotene, zeaxanthin, canthaxanthin, astaxanthin, adonixanthin,
cryptoxanthin, echinenone, adonirubin, and combinations thereof. Preferably, the carotenoid is
zeaxanthin.

Another embodiment of the invention is a method of making a carotenoid-producing cell.
This method includes (a) introducing into a cell a polynucleotide sequence encoding an
enzyme in the mevalonate pathway, which enzyme is expressed in the cell; and (b)
selecting a cell containing the polynucleotide sequence of step (a) that produces a carotenoid at
a level that is about 1.1-1,000 times the level of the carotenoid produced by the cell before
introduction of the polynucleotide sequence.

As used herein, the phrase "an enzyme in the mevalonate pathway" means the enzymes
involved in the mevalonate pathway for IPP biosynthesis and encoded by the atoB or phaA,
hcs, mvaA, mvk, pmk, and mvd genes. For purposes of the present invention, an enzyme is
"expressed in the cell" if it is detected using any one of the activity assays set forth
in Example 1. Assays for detecting the production of a carotenoid are well known in the
art. Examples 1, 11, and 12 provide typical assay procedures for identifying the presence
of zeaxanthin, lycopene, and astaxanthin, respectively. In a similar manner, assays for
the other carotenoids may be used to detect the presence in the cell or medium of e.g.
phytoene, canthaxanthin, adonixanthin, cryptoxanthin, echinenone, and adonirubin.

Thus, this method may be used to make the following exemplary carotenoids: phytoene,
lycopene, β-carotene, zeaxanthin, canthaxanthin, astaxanthin, adonixanthin,
cryptoxanthin, echinenone, adonirubin, and combinations thereof. In this method,
zeaxanthin is the preferred carotenoid.

This method includes producing cells capable of producing a carotenoid at a level that is 
about 1.1-1,000 times, preferably about 1.5-500 times, such as about 100 times or at least 
10 times, the level of the carotenoid produced by the cell before introduction of the poly- 
nucleotide sequence. 
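
The fold-increase ranges recited above can be illustrated with a short Python sketch. It
is not part of the disclosed method: the function names and the example titers are
hypothetical, and because the text qualifies each bound with "about", the hard cut-offs
used here are a simplifying assumption.

```python
def fold_increase(titer_after: float, titer_before: float) -> float:
    """Ratio of carotenoid produced after vs. before introducing the polynucleotide."""
    if titer_before <= 0:
        raise ValueError("titer_before must be positive to compute a ratio")
    return titer_after / titer_before


def classify_fold(fold: float) -> str:
    """Place a fold increase in the ranges recited in the text (hard cut-offs assumed)."""
    if 1.5 <= fold <= 500:
        return "preferred range (about 1.5-500 times)"
    if 1.1 <= fold <= 1000:
        return "broad range (about 1.1-1,000 times)"
    return "outside the recited ranges"


# Hypothetical titers in mg/L, for illustration only.
example = classify_fold(fold_increase(200.0, 20.0))
```

Both titers must be measured with the same assay and in the same units for the ratio to
be meaningful.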

In this method, the cell produces from about 1 mg/L to about 10 g/L of a carotenoid. It
is preferred that the cell produces from about 100 mg/L to about 9 g/L, such as, for
example, from about 500 mg/L to about 8 g/L, or from about 1 g/L to about 5 g/L, of a
carotenoid.

In this method, the cell may be selected from a yeast, fungus, bacterium, and alga.
Preferably, the cell is a bacterium selected from Paracoccus, Flavobacterium,
Agrobacterium, Alcaligenes, Erwinia, E. coli, and B. subtilis. More preferably, the
bacterium is a Paracoccus.

In this method, the cell may be a mutant cell. As used herein, a "mutant cell" is any
cell that contains a non-native polynucleotide sequence or a polynucleotide sequence that
has been altered from its native form (e.g., by rearrangement or deletion or substitution
of from 1-100, preferably 20-50, more preferably less than 10 nucleotides). Such a
non-native sequence may be obtained by random mutagenesis, chemical mutagenesis,
UV-irradiation, and the like. Preferably, the mutation results in the increased
expression of one or more genes in the mevalonate pathway that results in an increase in
the production of a carotenoid, such as zeaxanthin. Methods for generating, screening
for, and identifying such mutant cells are well known in the art and are exemplified in
the Examples below. Examples of such mutants are R114 or R1534. Preferably, the mutant
cell is R114.

In this method, the polynucleotide sequence is SEQ ID NO:42, or the following residues of
SEQ ID NO:42: 2622 to 3644, 3641 to 4690, 4687 to 5853, 5834 to 6970, 6970 to 7887,
7880 to 8878, as well as residues 59-292, 295-1158 or 1185-1610 of SEQ ID NO:157 and
residues 1-1170 or 1258-1980 of SEQ ID NO:177. These sequences may be used in this
method alone or as part of an expression vector. These sequences also may optionally be
operatively linked to expression control sequence(s). In this method, combinations of the
polynucleotide sequences identified herein may be used, such as for example, SEQ ID
NO:42, SEQ ID NO:157, and SEQ ID NO:177.
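
The residue ranges of SEQ ID NO:42 listed above are 1-based and inclusive. The following
Python sketch shows one way to slice such ranges out of a sequence string and to check
that each range is a whole number of codons. The dictionary and function names are
hypothetical, and the pairing of each range with an enzyme activity follows the
polynucleotide embodiments recited later in this document.

```python
# Ranges are (start, end), 1-based and inclusive, as written in the text.
MEV_OPERON_RANGES = {
    "HMG-CoA reductase": (2622, 3644),
    "isopentenyl diphosphate isomerase": (3641, 4690),
    "HMG-CoA synthase": (4687, 5853),
    "mevalonate kinase": (5834, 6970),
    "phosphomevalonate kinase": (6970, 7887),
    "diphosphomevalonate decarboxylase": (7880, 8878),
}


def extract(seq: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of seq."""
    if start < 1 or end < start or end > len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[start - 1:end]


def codon_count(start: int, end: int) -> int:
    """Number of codons in the range; raises if the length is not a multiple of 3."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


codons = {name: codon_count(*rng) for name, rng in MEV_OPERON_RANGES.items()}
```

Each codon count computed this way is one greater than the corresponding polypeptide
length recited elsewhere in this document (340, 349, 388, 378, 305 and 332 residues),
which is consistent with each range including a stop codon; that reading is an inference,
not a statement of the text. Adjacent ranges also overlap by a few nucleotides.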

Examples of expression vectors that may be selected for use in this method include
pBBR-K-mev-op16-1, pBBR-K-mev-op16-2, pDS-mvaA, pDS-idi, pDS-hcs, pDS-mvk, pDS-pmk,
pDS-mvd, pDS-His-mvaA, pDS-His-idi, pDS-His-hcs, pDS-His-mvk, pDS-His-pmk,
pDS-His-mvd, pBBR-K-Zea4, pBBR-K-Zea4-up, pBBR-K-Zea4-down, pBBR-K-PcrtE-crtE-3,
pBBR-tK-PcrtE-mvaA, pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk,
pBBR-tK-PcrtE-pmk, pBBR-tK-PcrtE-mvd, pBBR-K-PcrtE-mvaA-crtE-3, pDS-His-phaA,
pBBR-K-PcrtE-crtW, pBBR-K-PcrtE-crtWZ, pBBR-K-PcrtE-crtZW, and combinations thereof.

In this method, the polynucleotide sequence is introduced into the cell using any
conventional means. Examples of suitable methods for introducing a polynucleotide sequence
into a cell include transformation, transduction, transfection, lipofection,
electroporation [see e.g., Shigekawa and Dower, Biotechniques 6:742-751 (1988)],
conjugation [see e.g., Koehler and Thorne, Journal of Bacteriology 169:5271-5278 (1987)],
and biolistics.

The use of conjugation to transfer a polynucleotide sequence, such as in the form of an
expression vector, into recipient bacteria is generally effective and is a well-known
procedure (e.g. US 5,985,623). Depending on the strain of bacteria, it may be more common
to use transformation of competent cells with purified DNA.

Known electroporation techniques (both in vitro and in vivo) function by applying a brief
high-voltage pulse to electrodes positioned around the treatment region (e.g.
US 6,208,893). The electric field generated between the electrodes causes the cell
membranes to temporarily become porous, whereupon molecules of the implant agent enter
the cells. In known electroporation applications, this electric field comprises a single
square wave pulse on the order of 1000 V/cm of about 100 µs duration. Such a pulse may
be generated, for example, in known applications of the Electro Square Porator T820,
made by the BTX Division of Genetronics, Inc.
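
As a worked illustration of the field strength mentioned above, the voltage to set on a
pulse generator is the field strength multiplied by the electrode gap. The Python sketch
below assumes a uniform field between parallel electrodes; the function name and the
0.2 cm gap are hypothetical and are not taken from the text.

```python
def pulse_voltage(field_v_per_cm: float, gap_cm: float) -> float:
    """Voltage needed across parallel electrodes to reach a given field strength."""
    if field_v_per_cm <= 0 or gap_cm <= 0:
        raise ValueError("field strength and gap must be positive")
    return field_v_per_cm * gap_cm


# About 1000 V/cm across an assumed 0.2 cm electrode gap.
voltage = pulse_voltage(1000.0, 0.2)
```

Under these assumptions the setting is approximately 200 V, applied as a single square
pulse of about 100 µs.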

Biolistics is a system for delivering polynucleotides into a target cell using
microprojectile bombardment techniques. An illustrative embodiment of a method for
delivering polynucleotides into target cells by acceleration is a Biolistics Particle
Delivery System, which can be used to propel particles coated with DNA or cells through a
screen, such as a stainless steel or Nytex screen, onto a filter surface covered with
cultured target cells. The screen disperses the particles so that they are not delivered
to the target cells in large aggregates. It is believed that a screen intervening between
the projectile apparatus and the cells to be bombarded reduces the size of projectile
aggregates and may contribute to a higher frequency of transformation by reducing damage
inflicted on the recipient cells by projectiles that are too large.

For the bombardment, cells in suspension are preferably concentrated on filters or solid
culture medium. Alternatively, other target cells may be arranged on solid culture
medium. The cells to be bombarded are positioned at an appropriate distance below the
microprojectile stopping plate. If desired, one or more screens are also positioned
between the acceleration device and the cells to be bombarded. Through the use of these
well-known techniques one may obtain up to 1000 or more foci of cells transiently
expressing a marker gene. The number of cells in a focus which express the exogenous gene
product 48 hours post-bombardment often ranges from 1 to 10 and averages 1 to 3.

In bombardment transformation, one may optimize the prebombardment culturing conditions
and the bombardment parameters to yield the maximum numbers of stable transformants. Both
the physical and biological parameters for bombardment are important in this technology.
Physical factors are those that involve manipulating the polynucleotide/microprojectile
precipitate or those that affect the flight and velocity of either the macro- or
microprojectiles. Biological factors include all steps involved in manipulation of cells
before and immediately after bombardment, the osmotic adjustment of target cells to help
alleviate the trauma associated with bombardment, and also the nature of the transforming
DNA, such as linearized DNA or intact supercoiled plasmids.

Accordingly, it is contemplated that one may wish to adjust various of the bombardment
parameters in small-scale studies to fully optimize the conditions. One may particularly
wish to adjust physical parameters such as gap distance, flight distance, tissue distance,
and helium pressure. One may also minimize the trauma reduction factors (TRFs) by
modifying conditions which influence the physiological state of the recipient cells and
which may therefore influence transformation and integration efficiencies. For example,
the osmotic state, tissue hydration and the subculture stage or cell cycle of the
recipient cells may be adjusted for optimum transformation. The execution of other routine
adjustments will be known to those of skill in the art in light of the present disclosure.

Methods of particle-mediated transformation are well known to those of skill in the art.
For example, US 5,015,580 (specifically incorporated herein by reference) describes the
transformation of soybeans using such a technique.

Another embodiment of the invention is a method for engineering a bacterium to produce
an isoprenoid compound. Such a bacterium is made by (a) culturing a parent bacterium
in a medium under conditions permitting expression of an isoprenoid, and selecting a
mutant bacterium from the culture medium that produces about 1.1-1,000 times more of
an isoprenoid than the parent bacterium; (b) introducing into the mutant bacterium an
expression vector containing a polynucleotide sequence represented by SEQ ID NO:42
operably linked to an expression control sequence; and (c) selecting a bacterium that
contains the expression vector and produces at least about 1.1 times more of an
isoprenoid than the mutant in step (a).

In this embodiment, an isoprenoid compound means a compound structurally based on
isopentenyl diphosphate (IPP) units of the formula:

CH2=C(CH3)-CH2-CH2-OPP

Such compounds include the hemiterpenes, monoterpenes, sesquiterpenes, diterpenes,
triterpenes (e.g., phytosterols, phytoestrogens, phytoecdysones, estrogens),
tetraterpenes (carotenoids), and polyterpenes. Preferably, the isoprenoid is a carotenoid,
such as for example, one of the carotenoids identified above, in particular zeaxanthin.
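
The classes listed above differ in the number of five-carbon units they contain. The
Python sketch below tabulates the unit counts from standard terpene nomenclature, which
this passage does not itself spell out; the names in the code are hypothetical.
Polyterpenes are left out because they have no fixed unit count.

```python
# Number of C5 (isoprene) units per class, per standard terpene nomenclature.
ISOPRENE_UNITS = {
    "hemiterpene": 1,
    "monoterpene": 2,
    "sesquiterpene": 3,
    "diterpene": 4,
    "triterpene": 6,
    "tetraterpene": 8,  # carotenoids
}


def carbon_count(terpene_class: str) -> int:
    """Carbon atoms in the unmodified skeleton of a terpene class (5 per unit)."""
    return 5 * ISOPRENE_UNITS[terpene_class]


carotenoid_carbons = carbon_count("tetraterpene")
```

For tetraterpenes this gives 40 carbons, in line with the description of carotenoids as
C-40 isoprenoid compounds earlier in this document.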

The bacterium may be any bacterium that is capable of producing an isoprenoid compound
using the processes disclosed herein. Preferably, the bacterium is a Paracoccus,
Flavobacterium, Agrobacterium, Alcaligenes, Erwinia, E. coli, or B. subtilis. Even more
preferably, the bacterium is a Paracoccus. Preferably, the parent bacterium is R-1506 or
R-1512, and the mutant bacterium is R1534 or R114, preferably R114.

The bacterium is cultured in a medium and under conditions that are optimized for the
production of the isoprenoid. The selection of media and culture conditions is well within
the skill of the art. The assays set forth in Examples 1, 11, and 12 provide exemplary
methods for measuring the presence of certain carotenoids in a culture medium. By
optimizing the culture conditions and measuring the production of the target isoprenoid,
a mutant that meets the specific production parameters recited herein may be cultured and
selected. In this way, a mutant bacterium producing from about 1.1-1,000 times more of an
isoprenoid than the parent bacterium may be selected. Preferably, the mutant bacterium
produces from about 1.5-500 times more of an isoprenoid than the parent bacterium, such
as for example, at least about 100 times or at least about 10 times more of an isoprenoid
than the parent bacterium. That bacterium is then cultured and used in subsequent steps.

After selecting the mutant bacterium that produces the desired level of an isoprenoid, an 
expression vector is introduced into the bacterium using any of the methods set forth 
above or described in the examples. Any of the expression vectors defined herein may be 
introduced into the mutant cell. Preferably, the expression vector contains SEQ ID NO:42. 

Once the expression vector is introduced into the mutant bacterium, a stable transformant
is selected that produces at least about 1.1 times, such as about 5 to about 20 times,
more of an isoprenoid than the untransformed mutant. The selected transformant is then
cultured under conditions suitable for isoprenoid production, and then the isoprenoid is
isolated from the cell or the culture medium.

A further step in this method is introducing a mutation into the mutant bacterium that
results in the increased production of an isoprenoid compound by the bacterium. The
mutation may be selected from at least one of the following: inactivating the
polyhydroxyalkanoate (PHA) pathway, increasing expression of acetyl-CoA acetyltransferase,
increasing expression of FPP synthase, increasing expression of an enzyme in a carotenoid
biosynthetic pathway, and increasing the expression of an enzyme for converting
isopentenyl diphosphate (IPP) to dimethylallyl diphosphate (DMAPP).

Inactivation of the PHA pathway may be achieved by selecting for a mutant bacterium
that does not express a polypeptide encoded by phaB (nucleotide positions 1258-1980 of
SEQ ID NO:177) or by disrupting expression of the wild type phaB gene by homologous
recombination using SEQ ID NO:177 or fragments thereof.

In this method, increasing expression of acetyl-CoA acetyltransferase may be achieved by
introducing into the mutant bacterium a vector containing a polynucleotide sequence
represented by SEQ ID NO:175 or nucleotide positions 1-1170 of SEQ ID NO:177 operably
linked to an expression control sequence. In this method, increasing expression of FPP
synthase may be achieved by introducing into the mutant bacterium a vector containing a
polynucleotide sequence represented by nucleotides 295-1158 of SEQ ID NO:157 operably
linked to an expression control sequence. In this method, increasing expression of a
carotenoid gene may be achieved by introducing into the mutant bacterium a vector
comprising a polynucleotide sequence that encodes one or more enzymes in the carotenoid
biosynthetic pathway, such as for example a polynucleotide sequence selected from the
group consisting of SEQ ID NOs:180, 182, and 184 operably linked to an expression control
sequence.

In this method, it is preferred that the isoprenoid compound is isopentenyl diphosphate
(IPP). It is also preferred that the isoprenoid compound is a carotenoid, such as for
example, phytoene, lycopene, β-carotene, zeaxanthin, canthaxanthin, astaxanthin,
adonixanthin, cryptoxanthin, echinenone, adonirubin, and combinations thereof.

Another embodiment of the invention is a microorganism of the genus Paracoccus, which
microorganism has the following characteristics: (a) a sequence similarity to SEQ ID
NO:12 of >97% using a similarity matrix obtained from a homology calculation using
GeneCompar v. 2.0 software with a gap penalty of 0%; (b) a homology to R-1512, R1534,
R114 or R-1506 of >70% using DNA:DNA hybridization at 81.5°C; (c) a G+C content of
its genomic DNA that varies less than 1% from the G+C content of the genomic DNA of
R114, R-1512, R1534, and R-1506; and (d) an average DNA fingerprint that clusters at
about 58% similarity to strains R-1512, R1534, R114 and R-1506 using the AFLP procedure
of Example 2, with the proviso that the microorganism is not Paracoccus sp.
(MBIC3966).
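
Characteristics (a) to (d) above amount to a set of numeric thresholds plus one
exclusion. The Python sketch below shows how such a check might be expressed once the
measurements are available. The class, field and function names are hypothetical, and
treating "about 58%" as a lower bound with an adjustable tolerance is an assumption of
this sketch, not something the text specifies.

```python
from dataclasses import dataclass


@dataclass
class StrainProfile:
    seq12_similarity_pct: float    # (a) similarity to SEQ ID NO:12
    dna_hybridization_pct: float   # (b) DNA:DNA homology to a reference strain
    gc_content_diff_pct: float     # (c) G+C difference from the reference strains
    aflp_cluster_similarity_pct: float  # (d) AFLP clustering level
    is_mbic3966: bool = False      # excluded by the proviso


def meets_genotypic_criteria(p: StrainProfile, aflp_tolerance_pct: float = 5.0) -> bool:
    """True if all four characteristics hold and the proviso does not apply."""
    if p.is_mbic3966:
        return False
    return (
        p.seq12_similarity_pct > 97.0
        and p.dna_hybridization_pct > 70.0
        and abs(p.gc_content_diff_pct) < 1.0
        # Assumed reading of "about 58%": at least 58% minus the tolerance.
        and p.aflp_cluster_similarity_pct >= 58.0 - aflp_tolerance_pct
    )


# Hypothetical measurements, for illustration only.
candidate = StrainProfile(98.5, 85.0, 0.4, 60.0)
accepted = meets_genotypic_criteria(candidate)
```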

Methods for determining each of these characteristics are fully set forth in Example 2,
and it is contemplated when these methods are used that microorganisms meeting the above
criteria will be readily identifiable. It is preferred that a microorganism of the present
invention have each characteristic set forth above (i.e., a-d). However, any combination
of the characteristics a-d, which provides sufficient information to taxonomically validly
describe a microorganism belonging to the same species as R114, R-1512, R1534, and
R-1506, with the exception of Paracoccus sp. (MBIC3966) is also within the scope of the
invention.

Another embodiment of the invention is a microorganism of the genus Paracoccus, which
microorganism has the following characteristics: (a) 18:1ω7c comprising at least about
75% of the total fatty acids of the cell membranes; (b) an inability to use adonitol,
i-erythritol, gentiobiose, β-methylglucoside, D-sorbitol, xylitol and quinic acid as
carbon sources for growth; and (c) an ability to use L-asparagine and L-aspartic acid as
carbon sources for growth, with the proviso that the microorganism is not Paracoccus sp.
(MBIC3966).

Methods for determining each of these characteristics are also fully set forth in
Example 2, and it is contemplated when these methods are used that microorganisms meeting
the above criteria will be readily identifiable. It is preferred that a microorganism of
the present invention have each characteristic set forth above (i.e., a-c). However, any
combination of the characteristics a-c, which provides sufficient information to
taxonomically validly describe a microorganism belonging to the same species as R114,
R-1512, R1534, and R-1506, with the exception of Paracoccus sp. (MBIC3966) is also within
the scope of the invention.

Another embodiment of the invention is a microorganism of the genus Paracoccus, which
microorganism has the following characteristics: (a) an ability to grow at 40°C; (b) an
ability to grow in a medium having 8% NaCl; (c) an ability to grow in a medium having a
pH of 9.1; and (d) a yellow-orange colony pigmentation, with the proviso that the
microorganism is not Paracoccus sp. (MBIC3966).

Methods for determining each of these characteristics are also fully set forth in
Example 2, and it is contemplated when these methods are used that microorganisms meeting
the above criteria will be readily identifiable. It is preferred that a microorganism of
the present invention have each characteristic set forth above (i.e., a-d). However, any
combination of the characteristics a-d, which provides sufficient information to
taxonomically validly describe a microorganism belonging to the same species as R114,
R-1512, R1534, and R-1506, with the exception of Paracoccus sp. (MBIC3966) is also within
the scope of the invention.

A microorganism of the present invention may also be identified using any combination of
the 11 characteristics set forth above, which provide sufficient information to
taxonomically validly describe a microorganism belonging to the same species as R114,
R-1512, R1534, and R-1506, with the exception of Paracoccus sp. (MBIC3966).

In accordance with the foregoing the present invention provides 

(1) an isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of: 

(a) an amino acid sequence shown as residues 1 to 340 of SEQ ID NO:43, in particular
an amino acid sequence corresponding to positions 68-97 of SEQ ID NO:43;

(b) an amino acid sequence shown as residues 1 to 349 of SEQ ID NO:45, in particular 
an amino acid sequence corresponding to positions 1-30 of SEQ ID NO:45; 

(c) an amino acid sequence shown as residues 1 to 388 of SEQ ID NO:47, in particular 
an amino acid sequence corresponding to positions 269-298 of SEQ ID NO:47; 

(d) an amino acid sequence shown as residues 1 to 378 of SEQ ID NO:49, in particular
an amino acid sequence corresponding to positions 109-138 of SEQ ID NO:49;

(e) an amino acid sequence shown as residues 1 to 305 of SEQ ID NO:51, in particular 
an amino acid sequence corresponding to positions 198-227 of SEQ ID NO:51; 

(f) an amino acid sequence shown as residues 1 to 332 of SEQ ID NO:53, in particular 
an amino acid sequence corresponding to positions 81-110 of SEQ ID NO:53; 

(g) a fragment of an amino acid sequence selected from the group consisting of SEQ
ID NOs: 43, 45, 47, 49, 51, and 53, wherein said fragment has at least 30 contiguous
amino acid residues;

(h) an amino acid sequence of a fragment of a polypeptide selected from the group 
consisting of SEQ ID NOs: 43, 45, 47, 49, 51, and 53, the fragment having the activity 
of hydroxymethylglutaryl-CoA reductase (HMG-CoA reductase), isopentenyl diphos- 
phate isomerase, hydroxymethylglutaryl-CoA synthase (HMG-CoA synthase), 
mevalonate kinase, phosphomevalonate kinase, or diphosphomevalonate decarboxyl- 
ase; 

(i) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides of SEQ ID NO:42 or a complement of SEQ ID NO:42, wherein the polypeptide
has the activity of HMG-CoA reductase, isopentenyl diphosphate isomerase, HMG-CoA
synthase, mevalonate kinase, phosphomevalonate kinase, or diphosphomevalonate
decarboxylase; and

(j) a conservatively modified variant of SEQ ID NO:43, 45, 47, 49, 51 or 53. 
(2) an isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 287 of SEQ ID NO: 159; 

(b) at least 30 contiguous amino acid residues of SEQ ID NO:159; 

(c) an amino acid sequence of a fragment of SEQ ID NO: 159, the fragment having the 
activity of farnesyl diphosphate synthase (FPP synthase); 

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 295-1158 of SEQ ID NO:157 or a complement thereof,
wherein the polypeptide has the activity of FPP synthase; and

(e) a conservatively modified variant of SEQ ID NO:159.
(3) an isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 142 of SEQ ID NO: 160; 

(b) at least 30 contiguous amino acid residues of SEQ ID NO:160;

(c) an amino acid sequence of a fragment of SEQ ID NO:160, the fragment having the
activity of 1-deoxyxylulose-5-phosphate synthase (DXPS);

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 1185-1610 of SEQ ID NO:157 or a complement thereof,
wherein the polypeptide has the activity of DXPS; and

(e) a conservatively modified variant of SEQ ID NO:160.

(4) an isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of: 

(a) an amino acid sequence shown as residues 1 to 390 of SEQ ID NO: 178; 

(b) at least 30 contiguous amino acid residues of SEQ ID NO:178; 

(c) an amino acid sequence of a fragment of a polypeptide of SEQ ID NO:178, the
fragment having the activity of acetyl-CoA acetyltransferase;

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 1-1170 of SEQ ID NO:177 or a complement thereof,
wherein the polypeptide has the activity of acetyl-CoA acetyltransferase; and

(e) a conservatively modified variant of SEQ ID NO: 178. 

(5) an isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 240 of SEQ ID NO:179; 

(b) at least 30 contiguous amino acid residues of SEQ ID NO:179; 

(c) an amino acid sequence of a fragment of a polypeptide of SEQ ID NO:179, the
fragment having the activity of acetoacetyl-CoA reductase;

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 1258-1980 of SEQ ID NO:177 or a complement thereof,
wherein the polypeptide has the activity of acetoacetyl-CoA reductase; and

(e) a conservatively modified variant of SEQ ID NO:179.

(6) an isolated polynucleotide sequence comprising a nucleotide sequence selected from
the group consisting of SEQ ID NO:42, variants of SEQ ID NO:42 containing one or
more substitutions according to the Paracoccus sp. strain R1534 codon usage table
(Table 14), fragments of SEQ ID NO:42 that encode a polypeptide having an activity
selected from the group consisting of hydroxymethylglutaryl-CoA reductase (HMG-CoA
reductase), isopentenyl diphosphate isomerase, hydroxymethylglutaryl-CoA synthase
(HMG-CoA synthase), mevalonate kinase, phosphomevalonate kinase, and
diphosphomevalonate decarboxylase, and polynucleotide sequences that hybridize
under stringent conditions to a hybridization probe the nucleotide sequence of which
consists of at least 30 contiguous nucleotides of SEQ ID NO:42, or the complement of
SEQ ID NO:42, which polynucleotide encodes a polypeptide having an activity
selected from the group consisting of HMG-CoA reductase, isopentenyl diphosphate
isomerase, HMG-CoA synthase, mevalonate kinase, phosphomevalonate kinase, and
diphosphomevalonate decarboxylase; in particular

(a) an isolated polynucleotide sequence comprising a polynucleotide sequence selected
from the group consisting of nucleotides 2622 to 3644 of SEQ ID NO:42, fragments
thereof that encode a polypeptide having HMG-CoA reductase activity, and
polynucleotide sequences that hybridize under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of at least 30 contiguous nucleotides
spanning residues 2622 to 3644 of SEQ ID NO:42, or a complement thereof, wherein
the polynucleotide encodes a polypeptide having HMG-CoA reductase activity, more
particularly a polynucleotide sequence consisting of nucleotides 2622 to 3644 of SEQ
ID NO:42;

(b) an isolated polynucleotide sequence comprising a polynucleotide sequence
selected from the group consisting of nucleotides 3641 to 4690 of SEQ ID NO:42,
variants thereof containing one or more substitutions according to the Paracoccus sp.
strain R1534 codon usage table (Table 14), fragments of SEQ ID NO:42 that encode a
polypeptide having isopentenyl diphosphate isomerase activity, and polynucleotide
sequences that hybridize under stringent conditions to a hybridization probe the
nucleotide sequence of which consists of at least 30 contiguous nucleotides spanning
residues 3641 to 4690 of SEQ ID NO:42, or a complement thereof, wherein the
polynucleotide encodes a polypeptide having isopentenyl diphosphate isomerase activity,
more particularly a polynucleotide sequence consisting of nucleotides 3641 to 4690 of
SEQ ID NO:42;

(c) an isolated polynucleotide sequence comprising a polynucleotide sequence selected 
from the group consisting of nucleotides 4687 to 5853 of SEQ ID NO:42, variants 
thereof containing one or more substitutions according to the Paracoccus sp. strain 
R1534 codon usage table (Table 14), fragments of SEQ ID NO:42 that encode a poly- 
peptide having HMG-CoA synthase activity, and polynucleotide sequences that hybri- 
dize under stringent conditions to a hybridization probe the nucleotide sequence of 
which consists of at least 30 contiguous nucleotides spanning residues 4687 to 5853 of 
SEQ ID NO:42, or a complement thereof, wherein the polynucleotide encodes a
polypeptide having HMG-CoA synthase activity, more particularly a polynucleotide
sequence consisting of nucleotides 4687 to 5853 of SEQ ID NO:42;

(d) an isolated polynucleotide sequence comprising a polynucleotide sequence 
selected from the group consisting of nucleotides 5834 to 6970 of SEQ ID NO:42, 
variants thereof containing one or more substitutions according to the Paracoccus sp. 
strain R1534 codon usage table (Table 14), fragments of SEQ ID NO:42 that encode a 
polypeptide having mevalonate kinase activity, and polynucleotide sequences that 
hybridize under stringent conditions to a hybridization probe the nucleotide sequence 
of which consists of at least 30 contiguous nucleotides spanning residues 5834 to 6970 
of SEQ ID NO:42, or a complement thereof, wherein the polynucleotide encodes a 
polypeptide having mevalonate kinase activity, more particularly a polynucleotide
sequence consisting of nucleotides 5834 to 6970 of SEQ ID NO:42;

(e) an isolated polynucleotide sequence comprising a polynucleotide sequence selected 
from the group consisting of nucleotides 6970 to 7887 of SEQ ID NO:42, variants 
thereof containing one or more substitutions according to the Paracoccus sp. strain 
R1534 codon usage table (Table 14), fragments of SEQ ID NO:42 that encode a poly- 
peptide having phosphomevalonate kinase activity, and polynucleotide sequences that 
hybridize under stringent conditions to a hybridization probe the nucleotide sequence 
of which consists of at least 30 contiguous nucleotides spanning residues 6970 to 7887 
of SEQ ID NO:42, or a complement thereof, wherein the polynucleotide encodes a 
polypeptide having phosphomevalonate kinase activity, more particularly a
polynucleotide sequence consisting of nucleotides 6970 to 7887 of SEQ ID NO:42; or

(f) an isolated polynucleotide sequence comprising a polynucleotide sequence selected 
from the group consisting of nucleotides 7880 to 8878 of SEQ ID NO:42, variants 
thereof containing one or more substitutions according to the Paracoccus sp. strain 
R1534 codon usage table (Table 14), fragments of SEQ ID NO:42 that encode a poly- 
peptide having diphosphomevalonate decarboxylase activity, and polynucleotide 
sequences that hybridize under stringent conditions to a hybridization probe the 
nucleotide sequence of which consists of at least 30 contiguous nucleotides spanning 
residues 7880 to 8878 of SEQ ID NO:42, or a complement thereof, wherein the
polynucleotide encodes a polypeptide having diphosphomevalonate decarboxylase activity,
more particularly an isolated polynucleotide consisting of nucleotides 7880 to 8878 of
SEQ ID NO:42;

(7) an isolated polynucleotide sequence comprising a polynucleotide sequence selected 
from the group consisting of the nucleotide sequence of SEQ ID NO: 157, variants of 
SEQ ID NO: 157 containing one or more substitutions according to the Paracoccus sp. 
strain R1534 codon usage table (Table 14), fragments of SEQ ID NO:157 that encode 
a polypeptide having farnesyl diphosphate (FPP) synthase activity, 1-deoxy-D- 
xylulose 5-phosphate synthase activity or a polypeptide having the activity of XseB, 
and polynucleotide sequences that hybridize under stringent conditions to a 
hybridization probe the nucleotide sequence of which consists of at least 30 
contiguous nucleotides of SEQ ID NO:157, or the complement of SEQ ID NO:157, 
which polynucleotide encodes a polypeptide having an activity selected from the
group consisting of FPP synthase activity, 1-deoxy-D-xylulose 5-phosphate synthase 
activity, and the activity of XseB, in particular 

(a) an isolated polynucleotide sequence comprising a polynucleotide sequence selected 
from the group consisting of a nucleotide sequence spanning positions 59-292 of SEQ 
ID NO: 157, variants thereof containing one or more substitutions according to the 
Paracoccus sp. strain R1534 codon usage table (Table 14), fragments of the nucleotide 
sequence spanning positions 59-292 of SEQ ID NO: 157 that encode a polypeptide 
having the function of XseB, and polynucleotide sequences that hybridize under 
stringent conditions to a hybridization probe the nucleotide sequence of which con- 
sists of at least 30 contiguous nucleotides spanning positions 59-292 of SEQ ID 
NO: 157, or the complement of such a sequence, wherein the polynucleotide encodes a 
polypeptide having the function of XseB, more particularly an isolated polynucleotide 
consisting of nucleotides 59 to 292 of SEQ ID NO:157; 
(b) an isolated polynucleotide sequence comprising a polynucleotide sequence
selected from the group consisting of the nucleotide sequence spanning positions
295-1158 of SEQ ID NO:157, variants of the nucleotide sequence spanning positions
295-1158 of SEQ ID NO:157 containing one or more substitutions according to the
Paracoccus sp. strain R1534 codon usage table (Table 14), fragments of the nucleotide
sequence spanning positions 295-1158 of SEQ ID NO:157 that encode a polypeptide
having FPP synthase activity, and polynucleotide sequences that hybridize under
stringent conditions to a hybridization probe the nucleotide sequence of which consists
of at least 30 contiguous nucleotides spanning positions 295-1158 of SEQ ID NO:157,
or the complement of such a sequence, wherein the polynucleotide encodes a
polypeptide having FPP synthase activity, more particularly an isolated polynucleotide
consisting of nucleotides 295 to 1158 of SEQ ID NO:157;

(c) an isolated polynucleotide sequence comprising a polynucleotide sequence selected
from the group consisting of the nucleotide sequence spanning positions 1185-1610 of
SEQ ID NO:157, variants of the nucleotide sequence spanning positions 1185-1610 of
SEQ ID NO:157 containing one or more substitutions according to the Paracoccus sp.
strain R1534 codon usage table (Table 14), fragments of the nucleotide sequence
spanning positions 1185-1610 of SEQ ID NO:157 that encode a polypeptide having
1-deoxyxylulose-5-phosphate synthase activity, and polynucleotide sequences that
hybridize under stringent conditions to a hybridization probe the nucleotide sequence
of which consists of at least 30 contiguous nucleotides spanning positions 1185-1610
of SEQ ID NO:157, or the complement of such a sequence, wherein the polynucleotide
encodes a polypeptide having 1-deoxyxylulose-5-phosphate synthase activity,
more particularly an isolated polynucleotide consisting of nucleotides 1185 to 1610 of
SEQ ID NO:157;

(8) an isolated polynucleotide sequence comprising a polynucleotide sequence selected
from the group consisting of the nucleotide sequence of SEQ ID NO:177, variants of
SEQ ID NO:177 containing one or more substitutions according to the Paracoccus sp.
strain R1534 codon usage table (Table 14), fragments of SEQ ID NO:177 that encode
a polypeptide having an activity selected from the group consisting of acetyl-CoA
acetyltransferase and acetoacetyl-CoA reductase, and polynucleotide sequences that
hybridize under stringent conditions to a hybridization probe the nucleotide sequence
of which consists of at least 30 contiguous nucleotides of SEQ ID NO:177, or the
complement of SEQ ID NO:177, which polynucleotide encodes a polypeptide having
an activity selected from the group consisting of acetyl-CoA acetyltransferase and
acetoacetyl-CoA reductase, in particular
(a) an isolated polynucleotide sequence comprising a polynucleotide sequence selected
from the group consisting of nucleotides 1 to 1170 of SEQ ID NO:177, variants of
SEQ ID NO:177 containing one or more substitutions according to the Paracoccus sp.
strain R1534 codon usage table (Table 14), fragments of SEQ ID NO:177 that encode a
polypeptide having acetyl-CoA acetyltransferase activity, and polynucleotide
sequences that hybridize under stringent conditions to a hybridization probe the
nucleotide sequence of which consists of at least 30 contiguous nucleotides spanning
residues 1 to 1170 of SEQ ID NO:177, or a complement thereof, wherein the
polynucleotide encodes a polypeptide having acetyl-CoA acetyltransferase activity,
more particularly an isolated polynucleotide sequence consisting of nucleotides 1-1170
of SEQ ID NO:177;

(b) an isolated polynucleotide sequence comprising a polynucleotide sequence
selected from the group consisting of nucleotides 1258-1980 of SEQ ID NO:177,
variants of SEQ ID NO:177 containing one or more substitutions according to the
Paracoccus sp. strain R1534 codon usage table (Table 14), fragments of SEQ ID
NO:177 that encode a polypeptide having acetoacetyl-CoA reductase activity, and
polynucleotide sequences that hybridize under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of at least 30 contiguous nucleotides
spanning residues 1258-1980 of SEQ ID NO:177, or a complement thereof, wherein
the polynucleotide encodes a polypeptide having acetoacetyl-CoA reductase activity,
more particularly an isolated polynucleotide sequence consisting of nucleotides
1258-1980 of SEQ ID NO:177;

(9) an isolated polynucleotide sequence comprising a nucleotide sequence selected from 
the group consisting of SEQ ID NO:42, SEQ ID NO:157, SEQ ID NO:177, and
combinations thereof; 

(10) an expression vector comprising the polynucleotide sequence of any one of (6) (a) to
(6) (f), (7) (a) to (7) (c), (8) (a), (8) (b) or (9), in particular an expression vector 
wherein the polynucleotide sequence is operably linked to an expression control 
sequence, e.g. an expression vector further comprising a polynucleotide sequence that 
encodes an enzyme in the carotenoid biosynthetic pathway, more particularly an 
expression vector wherein the polynucleotide sequence is selected from the group 
consisting of SEQ ID NO: 180, SEQ ID NO: 182, SEQ ID NO: 184, and combinations 
thereof which are operably linked to an expression control sequence; 

(11) an expression vector selected from the group consisting of pBBR-K-mev-op16-1,
pBBR-K-mev-op16-2, pDS-mvaA, pDS-idi, pDS-hcs, pDS-mvk, pDS-pmk, pDS-mvd,
pDS-His-mvaA, pDS-His-idi, pDS-His-hcs, pDS-His-mvk, pDS-His-pmk, pDS-His-mvd,
pBBR-K-Zea4, pBBR-K-Zea4-up, pBBR-K-Zea4-down, pBBR-K-PcrtE-crtE-3,
pBBR-tK-PcrtE-mvaA, pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk,
pBBR-tK-PcrtE-pmk, pBBR-tK-PcrtE-mvd, pBBR-K-PcrtE-mvaA-crtE-3, pDS-His-phaA,
pBBR-K-PcrtE-crtW, pBBR-K-PcrtE-crtWZ, pBBR-K-PcrtE-crtZW, and
combinations thereof, in particular

(a) an expression vector selected from the group consisting of pBBR-K-mev-op16-1
and pBBR-K-mev-op16-2;

(b) an expression vector selected from the group consisting of pBBR-K-Zea4,
pBBR-K-Zea4-up, and pBBR-K-Zea4-down;

(c) an expression vector selected from the group consisting of pBBR-K-PcrtE-crtE-3,
pBBR-tK-PcrtE-mvaA, pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk,
pBBR-tK-PcrtE-pmk, pBBR-tK-PcrtE-mvd, and combinations thereof;

(d) an expression vector which is pBBR-K-PcrtE-mvaA-crtE-3;

(e) an expression vector which is pDS-His-phaA; or

(f) an expression vector selected from the group consisting of pBBR-K-PcrtE-crtW,
pBBR-K-PcrtE-crtWZ, and pBBR-K-PcrtE-crtZW;

(12) a cultured cell comprising the polynucleotide sequence of any one of (6) (a) to (f), (7)
(a) to (c), (8) (a), (8) (b) or (9), or an expression vector of (10) or (11), or a progeny
of the cell, wherein the cell expresses a polypeptide encoded by the polynucleotide
sequence, in particular a cell which is further characterized by a feature selected from

(a) further comprising a polynucleotide sequence that encodes an enzyme in the
carotenoid biosynthetic pathway, more particularly a cultured cell wherein the
polynucleotide sequence that encodes an enzyme in the carotenoid biosynthetic pathway is
selected from the group consisting of SEQ ID NOs:180, 182, and 184, or a progeny of
the cell, wherein the cell expresses polypeptides encoded by the polynucleotide
sequences, and

(b) being a member of a group selected from yeast, fungus, bacterium and alga,
in particular a bacterium selected from the group consisting of Paracoccus,
Flavobacterium, Agrobacterium, Alcaligenes, Erwinia, E. coli, and B. subtilis, more particularly
Paracoccus, more particularly Paracoccus selected from the group consisting of R-1506,
R-1512, R1534, and R114;

(13) a method of producing a carotenoid comprising culturing a cell of (12) under condi- 
tions permitting expression of a polypeptide encoded by the polynucleotide sequence, 
and isolating the carotenoid from the cell or the medium of the cell; 

(14) a method of making a carotenoid-producing cell comprising: 

(a) introducing into a cell a polynucleotide sequence encoding an enzyme in the
mevalonate pathway, which enzyme is expressed in the cell; and

(b) selecting a cell containing the polynucleotide sequence of step (a) that produces a
carotenoid at a level that is about 1.1-1,000 times the level of the carotenoid produced
by the cell before introduction of the polynucleotide sequence,
in particular a method characterized by a feature selected from

(i) the selecting step comprising selecting a cell containing the polynucleotide 
sequence of step (a) that produces a carotenoid at a level that is about 1.5-500 times, 
particularly about 100 times, or at least about 10 times, the level of the carotenoid pro- 
duced by the cell before introduction of the polynucleotide sequence; 
(ii) the cell producing from about 1 mg/L to about 10 g/L of a carotenoid;

(iii) the cell being selected from the group consisting of a yeast, fungus, bacterium,
and alga, in particular selected from the group consisting of Paracoccus,
Flavobacterium, Agrobacterium, Alcaligenes, Erwinia, E. coli, and B. subtilis, more particularly from
Paracoccus;

(iv) the cell in step (a) being a mutant cell, in particular being selected from the group
consisting of R114 and R1534, in particular the mutant cell producing about 1.1-1,000
times, in particular about 1.5-500 times, more particularly at least about 100 times
more or at least about 10 times more, the level of a carotenoid compared to its
non-mutant parent;

(v) the polynucleotide sequence being selected from polynucleotide sequences of (6)
(a) to (f), (7) (a) to (c), (8) (a), (8) (b) and (9), in particular wherein the
polynucleotide sequence is operably linked to an expression control sequence;

(vi) the polynucleotide sequence being an expression vector of (10) or (11);

(vii) the introducing step being selected from the group consisting of transformation,
transduction, transfection, lipofection, electroporation, conjugation, and biolistics;

(viii) the carotenoid being selected from the group consisting of phytoene, lycopene,
β-carotene, zeaxanthin, canthaxanthin, astaxanthin, adonixanthin, cryptoxanthin,
echinenone, adonirubin, and combinations thereof, in particular the carotenoid being
zeaxanthin;

(15) a method for engineering a bacterium to produce an isoprenoid compound
comprising:

(a) culturing a parent bacterium in a medium under conditions permitting expression
of an isoprenoid compound, and selecting a mutant bacterium from the culture
medium that produces about 1.1-1,000 times more of an isoprenoid compound than
the parent bacterium;

(b) introducing into the mutant bacterium an expression vector comprising a
polynucleotide sequence represented by SEQ ID NO:42 operably linked to an expression
control sequence; and 

(c) selecting a bacterium that contains the expression vector and produces at least 
about 1.1 times more of an isoprenoid compound than the mutant in step (a), in 
particular 

(i) a method further comprising introducing a mutation into the mutant bacterium, 
more particularly a method wherein the mutation causes an effect selected from at 
least one of the following: inactivating the polyhydroxyalkanoate (PHA) pathway, in- 
creasing expression of acetyl-CoA acetyltransferase, increasing expression of farnesyl 
diphosphate (FPP) synthase, increasing expression of an enzyme in a carotenoid path- 
way, increasing the expression of an enzyme for converting isopentenyl diphosphate 
(IPP) to dimethylallyl diphosphate (DMAPP), 

most particularly a method wherein inactivating of the PHA pathway comprises 
selecting for a mutant bacterium that does not express a polypeptide encoded by phaB 
(nucleotide positions 1258-1980 of SEQ ID NO: 177) or by disrupting expression of 
the wild type phaB gene by homologous recombination using SEQ ID NO: 177 or a 
fragment thereof, or 

a method wherein increasing expression of acetyl-CoA acetyltransferase comprises 
introducing into the mutant bacterium a vector comprising a polynucleotide sequence 
represented by SEQ ID NO: 175 or nucleotide positions 1-1170 of SEQ ID NO: 177 
operably linked to an expression control sequence, or 

a method wherein increasing expression of FPP synthase comprises introducing into 
the mutant bacterium a vector comprising a polynucleotide sequence represented by 
nucleotides 295-1158 of SEQ ID NO:157 operably linked to an expression control 
sequence, or 

a method wherein increasing expression of an enzyme in a carotenoid pathway 
comprises introducing into the mutant bacterium a vector comprising a 
polynucleotide sequence selected from the group consisting of SEQ ID NOs:180, 182,
and 184 operably linked to an expression control sequence; 

(b) a method wherein the isoprenoid is isopentenyl diphosphate (IPP);

(c) a method wherein the isoprenoid is a carotenoid, in particular a method wherein
the carotenoid is selected from the group consisting of phytoene, lycopene,
β-carotene, zeaxanthin, canthaxanthin, astaxanthin, adonixanthin, cryptoxanthin,
echinenone, adonirubin, and combinations thereof;
(d) a method wherein the parent bacterium is a Paracoccus, in particular R-1512 or
R-1506, or R1534 or R114, in particular wherein the mutant is R114;
(16) a microorganism of the genus Paracoccus, which microorganism has the following
characteristics:

(i) a sequence similarity to SEQ ID NO:12 of >97% using a similarity matrix obtained
from a homology calculation using GeneCompar v. 2.0 software with a gap penalty of
0%;
a homology to strain R-1512, R1534, R114 or R-1506 of >70% using DNA:DNA
hybridization at 81.5°C;
a G+C content of its genomic DNA that varies less than 1% from the G+C content of
the genomic DNA of R114, R-1512, R1534, and R-1506; and
an average DNA fingerprint that clusters at about 58% similarity to strains R-1512,
R1534, R114 and R-1506 using the AFLP procedure of Example 2, with the proviso
that the microorganism is not Paracoccus sp. (MBIC3966);

(ii) 18:1ω7c comprising at least about 75% of the total fatty acids of the cell
membranes;
an inability to use adonitol, i-erythritol, gentiobiose, β-methylglucoside, D-sorbitol,
xylitol and quinic acid as carbon sources for growth; and
an ability to use L-asparagine and L-aspartic acid as carbon sources for growth, with
the proviso that the microorganism is not Paracoccus sp. (MBIC3966); or

(iii) an ability to grow at 40°C;
an ability to grow in a medium having 8% NaCl;
an ability to grow in a medium having a pH of 9.1; and
a yellow-orange colony pigmentation, with the proviso that the microorganism is not
Paracoccus sp. (MBIC3966).

The following examples are provided to further illustrate certain aspects of the present in- 
vention. These examples are illustrative only and are not intended to limit the scope of the 
invention in any way. 

Example 1: Analytical and Biochemical Methods 

(a) Analysis of Carotenoids

Sample preparation. A solvent mixture of 1:1 dimethylsulfoxide (DMSO) and
tetrahydrofuran (THF) was first prepared. This solvent mixture was stabilized by the addition of
butylated hydroxytoluene (BHT, 0.5 g/l solvent mixture). Four milliliters of the stabilized
DMSO/THF mixture was added to 0.4 ml of bacterial culture in a disposable 15-ml
polypropylene centrifuge tube (giving a final dilution factor of 1/11). The tubes were capped
and mixed using a Vortex mixer for 10 seconds each. The samples were then put on a
Brinkmann Vibramix shaker for 20 minutes. The tubes were centrifuged at room
temperature for 4 minutes at 4000 rpm and aliquots of the clear yellow/orange supernatant
were transferred into brown glass vials for analysis by High Performance Liquid
Chromatography (HPLC).
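
The 1/11 dilution factor follows from mixing 0.4 ml of culture with 4 ml of solvent. As a minimal sketch, the following Python helper converts a carotenoid concentration measured in the extract back to the concentration in the original culture. The function names are illustrative and not part of the method; the calculation assumes that the volumes are additive and that extraction is complete.

```python
def dilution_factor(sample_ml: float, solvent_ml: float) -> float:
    """Fold dilution when sample_ml of culture is mixed with solvent_ml of solvent.

    Assumes the two volumes are simply additive.
    """
    return (sample_ml + solvent_ml) / sample_ml


def culture_concentration(extract_conc: float,
                          sample_ml: float = 0.4,
                          solvent_ml: float = 4.0) -> float:
    """Scale a concentration measured in the extract back to the culture.

    The result is in the same unit as extract_conc (for example mg/l).
    Defaults are the volumes given in the sample preparation procedure.
    """
    return extract_conc * dilution_factor(sample_ml, solvent_ml)


if __name__ == "__main__":
    # A hypothetical extract reading of 2.0 mg/l corresponds to about 22 mg/l of culture.
    print(round(dilution_factor(0.4, 4.0), 6))
    print(round(culture_concentration(2.0), 6))
```

Any reading taken from the extract would therefore be multiplied by 11 before being reported per liter of culture.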

HPLC. A reversed phase HPLC method was developed for the simultaneous determination
of astaxanthin, zeaxanthin, canthaxanthin, β-carotene, and lycopene. The method
was also able to separate the main cis-isomers of zeaxanthin. Chromatography was
performed using an Agilent 1100 HPLC system equipped with a thermostatted autosampler
and a diode array detector. The method parameters were as follows:
Column:        YMC Carotenoid C30 column, particle size 5 micron,
               250 x 4.6 mm I.D., steel
               (YMC, Part No. CT99S052546WT)
Guard column:  Pelliguard LC-18 cartridge, 20 mm
               (SUPELCO, Part No. 59654)
Mobile phase:  Methanol (MeOH)/methyl tert-butyl ether (TBME) gradient

               Time      % MeOH    % TBME
               Start     80        20
               10 min    65        35
               20 min    10        90

Run time: 28 min; Typical column pressure: 90 bar at start; Flow rate: 1.0 ml/min;
Detection: UV at 450 nm; Injection volume: 10 µl; Column temperature: 15°C
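
The gradient set points above can be turned into a mobile-phase composition for any time point. The following Python sketch does this under two assumptions that are not stated in the method: that the composition changes linearly between set points, and that the final composition is held from 20 min to the end of the 28-minute run. The names `GRADIENT` and `mobile_phase` are illustrative only.

```python
# Gradient set points from the method parameters: (time in min, % MeOH, % TBME).
GRADIENT = [(0.0, 80.0, 20.0), (10.0, 65.0, 35.0), (20.0, 10.0, 90.0)]


def mobile_phase(t_min: float) -> tuple:
    """Return (% MeOH, % TBME) at time t_min.

    Assumes linear ramps between set points and a hold at the last set point;
    both are assumptions made for this sketch.
    """
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1:]
    for (t0, m0, b0), (t1, m1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            fraction = (t_min - t0) / (t1 - t0)
            return (m0 + fraction * (m1 - m0), b0 + fraction * (b1 - b0))
    return GRADIENT[-1][1:]


if __name__ == "__main__":
    for t in (0, 5, 10, 15, 20, 28):
        print(t, mobile_phase(t))
```

Under these assumptions the two percentages always sum to 100, which is a quick sanity check on the table.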

Reagents. Methanol and TBME were HPLC grade and were obtained from EM Science
and J.T. Baker, respectively. DMSO (Omnisolve) was purchased from EM Science. THF 
(HPLC solvent) was from Burdick and Jackson. 

Calculations. Quantitative analyses were performed with a two-level calibration using
external standards (provided by Hoffmann-La Roche, Basel, Switzerland). Calculations
were based on peak areas.
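
A two-level external-standard calibration can be sketched as a straight line through the peak areas of two standards of known concentration. The Python example below illustrates the arithmetic only; the function names and numeric values are hypothetical, and fitting an intercept rather than forcing the line through the origin is an assumption of this sketch, not a statement about the instrument software actually used.

```python
def two_level_calibration(std_low: tuple, std_high: tuple) -> tuple:
    """Fit area = slope * concentration + intercept through two standards.

    Each standard is a (concentration, peak_area) pair.
    """
    (c1, a1), (c2, a2) = std_low, std_high
    if c1 == c2:
        raise ValueError("the two standards must differ in concentration")
    slope = (a2 - a1) / (c2 - c1)
    intercept = a1 - slope * c1
    return slope, intercept


def concentration_from_area(area: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to obtain a concentration from a peak area."""
    return (area - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standards: 10 and 100 ug/ml giving areas of 200 and 2000.
    slope, intercept = two_level_calibration((10.0, 200.0), (100.0, 2000.0))
    print(concentration_from_area(1000.0, slope, intercept))
```

The concentration obtained this way refers to the injected extract and still has to be corrected for the sample dilution.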

Selectivity. The selectivity of the method was verified by injecting standard solutions of
the relevant carotenoid reference compounds. The target compounds
(all-trans-carotenoids) were completely separated and showed no interference. Some minor cis isomers may
coelute, although these potentially interfering isomers are rare and need not be considered
in routine analyses. The retention times of the compounds are listed in Table 1.



Table 1. HPLC retention times for carotenoids.

Carotenoid              Retention time (min.)    Carotenoid       Retention time (min.)
Astaxanthin             6.99                     Canthaxanthin    9.95
Adonixanthin            7.50                     Cryptoxanthin    13.45
15-cis-Zeaxanthin       7.80                     β-Carotene       17.40
13-cis-Zeaxanthin       8.23                     Lycopene         21.75
all-trans-Zeaxanthin    9.11

Linearity. Twenty-five milligrams of all-trans-zeaxanthin were dissolved in 50 ml of DMSO/THF
mixture (giving a final zeaxanthin concentration of 500 µg/ml). A dilution series was
prepared (final zeaxanthin concentrations of 250, 100, 50, 10, 5, 1, and 0.1 µg/ml) and
analyzed by the HPLC method described above. A linear range was found from 0.1 µg/ml to
250 µg/ml. The correlation coefficient was 0.9998.
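
The stock concentration and the overall dilution needed for each level of the series follow directly from the figures above. The Python sketch below reproduces that arithmetic; the function names are illustrative, and how the dilutions were actually carried out (directly from the stock or serially) is not stated in the text.

```python
STOCK_MASS_MG = 25.0       # all-trans-zeaxanthin weighed in
STOCK_VOLUME_ML = 50.0     # volume of DMSO/THF mixture
LEVELS_UG_PER_ML = [250, 100, 50, 10, 5, 1, 0.1]


def stock_concentration_ug_per_ml(mass_mg: float, volume_ml: float) -> float:
    """Convert mg per ml into ug/ml."""
    return mass_mg * 1000.0 / volume_ml


def fold_dilutions(stock_ug_per_ml: float, levels: list) -> dict:
    """Overall fold dilution of the stock required to reach each level."""
    return {level: stock_ug_per_ml / level for level in levels}


if __name__ == "__main__":
    stock = stock_concentration_ug_per_ml(STOCK_MASS_MG, STOCK_VOLUME_ML)
    print(stock)
    for level, fold in fold_dilutions(stock, LEVELS_UG_PER_ML).items():
        print(level, round(fold, 3))
```

The series spans a 2-fold to 5000-fold dilution of the stock, which covers the reported linear range of 0.1 to 250 µg/ml.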

Limit of detection. The lower limit of detection for zeaxanthin by this method was
determined to be 60 µg/l. A higher injection volume and optimization of the integration
parameters made it possible to lower the detection limit to approximately 5 µg/l.

Reproducibility. The retention time for all-trans-zeaxanthin was very stable (relative
standard deviation (RSD), 0.2 %). The peak area reproducibility, based on ten repetitive
analyses of the same culture sample, was determined to be 0.17 % RSD for all-trans-zeaxanthin
and 1.0 % for cryptoxanthin.
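
The relative standard deviation reported above is the sample standard deviation expressed as a percentage of the mean. A minimal Python sketch follows; the peak areas in the usage example are hypothetical and serve only to show the calculation, and the use of the sample (n-1) standard deviation is an assumption.

```python
import statistics


def relative_standard_deviation(values: list) -> float:
    """Return the RSD in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined for a mean of zero")
    return 100.0 * statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical peak areas from repeated injections of one sample.
    areas = [1002.0, 998.0, 1001.0, 999.0, 1000.0]
    print(round(relative_standard_deviation(areas), 3))
```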

(b) Preparation of crude extracts and enzyme assay methods.

Preparation of crude extracts. Crude extracts of Paracoccus and E. coli were prepared by
resuspending washed cell pellets in 1 ml of extraction buffer (the buffer used depended on the
enzyme being assayed; compositions are specified along with each enzyme assay
procedure described below). Cell suspensions were placed in a 2-ml plastic vial and disrupted
by agitation with glass beads using a Mini Bead Beater 8 (Biospec Products, Bartlesville,
OK, USA). Disruption was performed at 4°C using a medium agitation setting. The
disrupted preparations were centrifuged at 21,000 x g for 20 minutes at 4°C to sediment
the cell debris, and the supernatants were used directly for enzyme assays.

Protein determinations. Protein concentrations in crude extracts were determined by the
method of Bradford [Anal. Biochem. 72:248-254 (1976)] using the Bio-Rad Protein Assay
Reagent (Bio-Rad, Hercules, CA, USA). Bovine serum albumin was used as the reference
protein for construction of standard curves.

Acetyl-CoA acetyltransferase assays. Crude extracts were prepared in 150 mM EPPS
(N-[2-hydroxyethyl]piperazine-N'-[3-propanesulfonic acid]) buffer, pH 8.0. Assays were
performed in the thiolysis direction according to the method described by Slater et al. [J.
Bacteriol. 180:1979-1987 (1998)]. This assay measures the disappearance of
acetoacetyl-CoA spectrophotometrically at 304 nm. Reaction mixtures contained 150 mM EPPS
buffer (pH 8.0), 50 mM MgCl2, 100 µM CoA, 40 µM acetoacetyl-CoA and crude extract.
Reactions were carried out at 30°C and were initiated by addition of crude extract. The
disappearance of acetoacetyl-CoA at 304 nm was monitored using a SpectraMAX Plus
plate reader (Molecular Devices Corp., Sunnyvale, CA, USA) and a quartz microtiter plate
(any standard spectrophotometer can also be used). Activity (expressed as U/mg protein)
was calculated using a standard curve constructed with acetoacetyl-CoA (1 unit of activity
= 1 nmol acetoacetyl-CoA consumed/min.). The lower limit of detection of acetyl-CoA
acetyltransferase activity was 0.006 U/mg.
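
The unit definition above (1 U = 1 nmol of acetoacetyl-CoA consumed per minute, reported per mg of protein) can be written out as a short calculation. The following Python sketch is illustrative only: the function names and the example numbers are hypothetical, and the amount of substrate consumed is assumed to have been read off the acetoacetyl-CoA standard curve beforehand.

```python
DETECTION_LIMIT_U_PER_MG = 0.006  # lower limit of detection stated for this assay


def specific_activity(nmol_consumed: float, minutes: float, protein_mg: float) -> float:
    """Specific activity in U/mg protein, with 1 U = 1 nmol substrate consumed per minute."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    return nmol_consumed / minutes / protein_mg


def is_detectable(activity_u_per_mg: float,
                  limit: float = DETECTION_LIMIT_U_PER_MG) -> bool:
    """True if the activity is at or above the stated detection limit."""
    return activity_u_per_mg >= limit


if __name__ == "__main__":
    # Hypothetical run: 6 nmol consumed in 2 min by 0.05 mg of extract protein.
    activity = specific_activity(6.0, 2.0, 0.05)
    print(round(activity, 3), is_detectable(activity))
```

The same arithmetic applies to the other assays in this Example once the unit of product (nmol, µmol or pmol) given for each assay is substituted.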

HMG-CoA synthase assays. HMG-CoA synthase was assayed according to the method of
Honda et al. [Hepatology 27:154-159 (1998)]. In this assay, the formation of HMG-CoA
from acetyl-CoA and acetoacetyl-CoA is measured directly by separating the reaction
product and substrates by HPLC. Crude extracts were prepared in 50 mM Tris-HCl buffer
(pH 8.0). Reaction mixtures (0.1 ml) contained 50 mM Tris-HCl buffer (pH 8.0), 0.1 mM
EDTA, 20 mM MgCl2, 0.1 mM acetoacetyl-CoA, 0.8 mM acetyl-CoA and crude extract.
Reactions were pre-incubated for 2 minutes at 30°C before adding the crude extract. After
5 minutes of reaction at 30°C, the reactions were stopped by adding 0.2 ml of 200 mM
tetra-butyl ammonium phosphate (TBAP, dissolved in methanol-water (3:2, final pH was
5.5) and containing 0.2 mM propionyl-CoA as an internal recovery standard). The
mixture was then centrifuged for 3 minutes at 21,000 x g at 4°C and subsequently kept on ice
until analyzed by reversed phase ion-pair HPLC. HMG-CoA and propionyl-CoA were
separated from acetyl-CoA and acetoacetyl-CoA using a Nova-Pak C18 column (3.9 x 150
mm, Waters Corporation, Milford, MA, USA). The injection volume was 20 µl, the
mobile phase was 50 mM TBAP dissolved in methanol-water (1:1, final pH was 5.5), and
the flow rate was 1.0 ml/min. HMG-CoA and propionyl-CoA were detected by
absorbance at 254 nm. HMG-CoA produced in the reaction was quantified by
comparison with a standard curve created using authentic HMG-CoA. Activity is defined
as U/mg protein. One unit of activity = 1 nmol HMG-CoA produced/min. The lower
limit of detection of HMG-CoA synthase was about 1 U/mg.
HMG-CoA reductase assays. Crude extracts were prepared in 25 mM potassium
phosphate buffer (pH 7.2) containing 50 mM KCl, 1 mM EDTA and a protease inhibitor
cocktail (Sigma Chemical Co., St. Louis, MO, USA, catalog #P-2714). Assays were performed
according to the method of Takahashi et al. [J. Bacteriol. 181:1256-1263 (1999)]. This
assay measures the HMG-CoA-dependent oxidation of NADPH spectrophotometrically at
340 nm. Reaction mixtures contained 25 mM potassium phosphate buffer (pH 7.2),
50 mM KCl, 1 mM EDTA, 5 mM dithiothreitol, 0.3 mM NADPH, 0.3 mM R,S-HMG-CoA
and crude extract. Reactions were performed at 30°C and were initiated by the addition of
HMG-CoA. HMG-CoA-dependent oxidation of NADPH was monitored at 340 nm using
a SpectraMAX Plus plate reader (Molecular Devices Corp., Sunnyvale, CA, USA) and a
quartz microtiter plate (any standard spectrophotometer may be used). Activity
(expressed as U/mg protein) was calculated using a standard curve constructed with
NADPH (1 unit of activity = 1 µmol NADPH oxidized/min.). The lower limit of detection
of HMG-CoA reductase activity was 0.03 U/mg.
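
Converting an absorbance change into units via an NADPH standard curve can be sketched as a least-squares line followed by a division. The Python example below is a hedged illustration: the standard-curve points and the measured rate are hypothetical, the function names are not from the cited method, and a linear relationship between absorbance at 340 nm and the amount of NADPH in the well is assumed.

```python
def fit_line(xs: list, ys: list) -> tuple:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reductase_activity(delta_a340_per_min: float,
                       a340_per_umol: float,
                       protein_mg: float) -> float:
    """U/mg protein, with 1 U = 1 umol NADPH oxidized per minute.

    delta_a340_per_min is the (negative) absorbance change per minute;
    a340_per_umol is the slope of the NADPH standard curve.
    """
    return abs(delta_a340_per_min) / a340_per_umol / protein_mg


if __name__ == "__main__":
    # Hypothetical standard curve: umol NADPH per well versus A340.
    slope, intercept = fit_line([0.0, 0.01, 0.02, 0.03], [0.0, 0.2, 0.4, 0.6])
    print(round(slope, 6), round(intercept, 6))
    print(round(reductase_activity(-0.1, slope, 0.05), 6))
```

A background rate measured without HMG-CoA would normally be subtracted from the absorbance change first; that step is omitted here for brevity.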

Mevalonate kinase, phosphomevalonate kinase and mevalonate diphosphate decarboxylase
assays. The preparation of substrates and the assay procedures for mevalonate kinase,
phosphomevalonate kinase and mevalonate diphosphate decarboxylase have been
described in detail by Popjak [Methods Enzymol. 15:393-425 (1969)]. For all assays, one
unit of enzyme activity is defined as 1 µmol of product formed/minute. In addition to
these spectrophotometric and radiochromatographic assays, alternate methods, for
example using HPLC separation of reaction substrates and products, can be used. The
lower limit of detection of mevalonate kinase, phosphomevalonate kinase and mevalonate
diphosphate decarboxylase is typically about 0.001 U/mg protein.

IPP isomerase assays. Crude extracts were prepared in 50 mM Tris-HCl buffer (pH 7.5).
Assays were performed using the method of Spurgeon et al. [Arch. Biochem. Biophys.
230:445-454 (1984)]. This assay is based on the difference in acid-lability of IPP and
DMAPP. Reaction mixtures (0.1 ml final volume) contained 50 mM Tris-HCl buffer (pH
7.5), 2 mM dithiothreitol, 5 mM MgCl2, 20 µM [1-14C]-IPP and crude extract. Reactions
were carried out at 30°C for 15 minutes and terminated by the addition of 0.3 ml of a
mixture of concentrated HCl:methanol (4:1) and an additional incubation at 37°C for 20 min.
Hexane (0.9 ml) was added and the tubes were mixed (4 times for 10 seconds using a
vortex mixer). After centrifugation (21,000 x g, 5 minutes), 0.6 ml of the hexane layer was
transferred to a scintillation vial, scintillation fluid was added, and the radioactivity
counted. Activity is expressed as U/mg protein. One unit of activity = 1 pmol [1-14C]-IPP
incorporated into acid labile products/min. The lower limit of detection of IPP isomerase
activity was 1 U/mg.
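
Because only 0.6 ml of the 0.9 ml hexane layer is counted, the measured radioactivity has to be scaled up before it is converted to pmol of product. The Python sketch below shows one way to do this. It rests on several assumptions not stated in the text: that the acid-labile products partition completely into the hexane, that the hexane volume is unchanged by the extraction, that the specific radioactivity of the substrate (`dpm_per_pmol`) is known, and that a no-enzyme blank is subtracted. All names and numbers are illustrative.

```python
HEXANE_ADDED_ML = 0.9     # hexane added to each reaction
HEXANE_COUNTED_ML = 0.6   # portion of the hexane layer transferred for counting


def radiometric_activity(dpm_counted: float,
                         dpm_per_pmol: float,
                         protein_mg: float,
                         minutes: float = 15.0,
                         blank_dpm: float = 0.0) -> float:
    """Specific activity in U/mg, with 1 U = 1 pmol [1-14C]-IPP converted per minute.

    dpm_per_pmol is the specific radioactivity of the substrate (assumed known).
    """
    net_dpm = dpm_counted - blank_dpm
    total_dpm = net_dpm * HEXANE_ADDED_ML / HEXANE_COUNTED_ML
    pmol_product = total_dpm / dpm_per_pmol
    return pmol_product / minutes / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: 1200 dpm counted, 200 dpm blank, 100 dpm/pmol, 0.02 mg protein.
    print(round(radiometric_activity(1200.0, 100.0, 0.02, blank_dpm=200.0), 3))
```

The same scaling would apply to any assay that uses this hexane extraction and counting step with the same volumes.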
FPP synthase assays. Crude extracts were prepared in 50 mM Tris-HCl buffer (pH 8.0).
The FPP synthase assay procedure was similar to the IPP isomerase assay described above,
being based on the difference in acid lability of IPP and FPP (Spurgeon et al., supra).
Reaction mixtures (0.1 ml final volume) contained 50 mM Tris-HCl buffer (pH 8.0),
2 mM dithiothreitol, 5 mM MgCl2, 20 µM [1-14C]-IPP, 25 µM GPP (geranyl
pyrophosphate) and crude extract. Reactions were carried out at 30°C for 15 minutes and
terminated by the addition of 0.3 ml of a mixture of concentrated HCl:methanol (4:1) and an
additional incubation at 37°C for 20 minutes. Hexane (0.9 ml) was added and the tubes
were mixed (4 times for 10 seconds using a vortex mixer). After centrifugation (21,000 x g, 5
minutes), 0.6 ml of the hexane layer was transferred to a scintillation vial, scintillation fluid
was added, and the radioactivity counted. Units of enzyme activity, and the lower limit of
detection, were the same as defined above for IPP isomerase. In cases where high IPP
isomerase activity interferes with measurement of FPP synthase activity, crude extract may
be preincubated for 5 minutes in the presence of 5 mM iodoacetamide to inhibit IPP
isomerase activity.

GGPP synthase assays. Crude extracts were prepared in 50 mM Tris-HCl buffer (pH 8.0)
containing 2 mM dithiothreitol. GGPP synthase was assayed according to the procedure
of Kuzuguchi et al. [J. Biol. Chem. 274:5888-5894 (1999)]. This assay is based on the same
principle as described above for FPP synthase. Reaction mixtures (0.1 ml final volume)
contained 50 mM Tris-HCl buffer (pH 8.0), 2 mM dithiothreitol, 5 mM MgCl2, 20 µM
[1-14C]-IPP, 25 µM FPP and crude extract. All reaction conditions and subsequent treatment
of samples for scintillation counting were identical to those described above for FPP
synthase. Treatment of extract with iodoacetamide to inhibit IPP isomerase activity may also
be used as above. Units of enzyme activity, and the lower limit of detection, were the same
as defined above for IPP isomerase.

Acetoacetyl-CoA reductase assays. Crude extracts were prepared in 50 mM Tris-HCl buffer
(pH 7.5) containing 50 mM KCl and 5 mM dithiothreitol. Acetoacetyl-CoA reductase was
assayed according to the procedure of Chohan and Copeland [Appl. Environ. Microbiol.
64:2859-2863 (1998)]. This assay measures the acetoacetyl-CoA-dependent oxidation of
NADPH spectrophotometrically at 340 nm. Reaction mixtures (1 ml) contained 50 mM
Tris-HCl buffer (pH 8.5), 15 mM MgCl2, 250 µM NADPH, and 100 µM acetoacetyl-CoA.
Reactions were performed in a quartz cuvette at 30°C and were initiated by the addition of
acetoacetyl-CoA. Activity (expressed as U/mg protein) was calculated using a standard
curve constructed with NADPH (1 unit of activity = 1 µmol NADPH oxidized/min). The
lower limit of detection of acetoacetyl-CoA reductase activity was about 0.01 U/mg.
Example 2: Taxonomic Reclassification of Flavobacterium sp. as Paracoccus

This Example describes the taxonomic reclassification of the zeaxanthin-producing
bacterium formerly designated Flavobacterium sp. strain R-1512 (ATCC 21588) as Paracoccus
sp. strain R-1512 (ATCC 21588). A comprehensive genomic and
biochemical/physiological analysis was performed by the Belgian Coordinated Collections
of Microorganisms/Laboratorium voor Microbiologie, Universiteit Gent (BCCM™/LMG),
using state-of-the-art methods currently accepted as the scientific standards for bacterial
classification. Besides Paracoccus sp. strain R-1512, several other bacteria belonging to the
genus Paracoccus were included in the study (summarized in Table 2).

Table 2. Bacteria used in taxonomic study.

Bacterium            Strain designation     Source or reference
Paracoccus sp.       R-1512 (ATCC 21588)    ATCC (environmental isolate); US 3,891,504
Paracoccus sp.       R1534                  mutant derived from R-1512; US 6,087,152
Paracoccus sp.       R114                   mutant derived from R-1512; this work
Paracoccus sp.       R-1506                 environmental isolate; this work
Paracoccus sp.       MBIC3024               H. Kasai, Kamaishi Institute, Japan
Paracoccus sp.       MBIC3966               H. Kasai, Kamaishi Institute, Japan
Paracoccus sp.       MBIC4017               H. Kasai, Kamaishi Institute, Japan
Paracoccus sp.       MBIC4020               H. Kasai, Kamaishi Institute, Japan
P. marcusii          DSM 11574 T            Harker et al., infra
P. carotinifaciens   E-396 T                Tsubokura et al., infra
P. solventivorans    DSM 6637 T             Siller et al., Int. J. Syst. Bacteriol.
                                            46:1125-1130 (1996)

Strains R1534 and R114 are mutants derived from strain R-1512 by classical mutagenesis
and screening for improved zeaxanthin production. The primary screening was accomplished
by selecting the colonies with the highest color intensity. A secondary screening
was accomplished in liquid culture media by the HPLC methods according to Example 1.
Strain R-1506 is an independent isolate obtained from the same initial screening of
environmental microorganisms that provided strain R-1512. Strains MBIC3024, MBIC3966,
MBIC4017 and MBIC4020 were identified as members of the genus Paracoccus by the
nucleotide sequences of their 16S rDNA genes (DNA sequences were deposited in the
public EMBL database, see Table 5). Paracoccus marcusii DSM 11574 T and Paracoccus
carotinifaciens E-396 T are recently described type strains of carotenoid-producing bacteria
[Harker et al., Int. J. Syst. Bacteriol. 48:543-548 (1998); Tsubokura et al., Int. J. Syst.
Bacteriol. 49:277-282 (1999)]. Paracoccus solventivorans DSM 6637 T was included as a
"control" strain, being a member of the genus Paracoccus but distantly related to the other
bacteria used.

Preliminary experiments resulted in the following conclusions. Each of the methods set
forth herein has a well-recognized ability to define taxonomic relatedness or relative degree
of similarity between organisms. The methods and their use for delineating bacterial taxa
were described and compared in detail by Van Damme et al., Microbiological Reviews
60:407-438 (1996) and Janssen et al., Microbiology 142:1881-1893 (1996).

(1) Fatty acid analysis of the cell membranes of strains R1534 and R114 showed that the
two strains were highly similar and indicated a taxonomic relatedness of these strains to
Paracoccus denitrificans and Rhodobacter capsulatus.

(2) One-dimensional gel electrophoresis of cellular proteins showed a high similarity (i.e.,
a relatedness at the intra-species level) between R1534 and R114, but the profiles did not
justify allocation of these strains to either R. capsulatus or P. denitrificans.

(3) DNA:DNA hybridization between strain R1534 and R. capsulatus LMG2962 T and P.
denitrificans LMG4218 T confirmed that strain R1534 is neither R. capsulatus nor
P. denitrificans.

(4) Sequencing of 16S rDNA genes from strains R1534 and R114 showed that these organisms
belong to the genus Paracoccus, but that they represent a new species. The highest
degree of sequence similarity was observed with the 16S rDNA gene of Paracoccus sp.
strains MBIC3966, MBIC4020 and MBIC3024.

(5) DNA fingerprinting of strains R1534 and R-1512 using Amplified Fragment Length
Polymorphism (AFLP™) showed high overall similarity of the genomic DNA from the two
strains, indicating an infraspecific relatedness (i.e., AFLP™ can differentiate between two
members of the same species).

In the following sections, the results and conclusions of the present comprehensive
taxonomic study of Paracoccus sp. strain R-1512 (and its mutant derivatives R1534 and R114)
are set forth.

16S rDNA sequencing and phylogenetic study. The bacteria set forth in Table 2 were
grown in LMG medium 185 (TSA; BBL 11768, supplemented where necessary with 1.5%
Difco Bacto agar). Genomic DNA was prepared according to the protocol of Niemann et
al. [J. Appl. Microbiol. 82:477-484 (1997)]. Genes coding for 16S rDNA were amplified
from genomic DNA from strains R-1512, R1534, R114 and R-1506 by polymerase chain
reaction (PCR) using the primers shown in Table 3.



Table 3. Primers used for PCR amplification of DNA coding for 16S rDNA in Paracoccus
sp. strains R-1512, R1534, R114, and R-1506.

Primer name a   Sequence (5'→3')                SEQ ID NO      Position b
16F27           AGA GTT TGA TCC TGG CTC AG      SEQ ID NO:1    8-27
16F38           CTG GCT CAG GAC/T GAA CGC TG    SEQ ID NO:2    19-38
16R1522         AAG GAG GTG ATC CAG CCG CA      SEQ ID NO:3    1541-1522

a F, forward primer; R, reverse primer. Forward primer 16F27 (Synonym: PA) was used
for strains R1534 and R-1506, while forward primer 16F38 (Synonym: ARI C/T) was used
for strains R-1512 and R114. The reverse primer 16R1522 (Synonym: PH) was used for all
strains.

b Hybridization position referring to E. coli 16S rDNA gene sequence numbering.

The PCR-amplified DNAs were purified using the QIAquick PCR Purification Kit (Qiagen
GmbH, Hilden, Germany). Complete sequencing was performed using an Applied Biosystems,
Inc. 377 DNA Sequencer and the protocols of the manufacturer (Perkin-Elmer,
Applied Biosystems Division, Foster City, CA, USA) using the "ABI PRISM™ Big Dye™
Terminator Cycle Sequencing Ready Reaction Kit (with AmpliTaq® DNA Polymerase,
FS)". The primers used for DNA sequencing are shown in Table 4.

Table 4. Primers used for sequencing PCR-amplified segments of genes coding for 16S
rDNA in Paracoccus sp. strains R-1512, R1534, R114 and R-1506.

Primer name a/Synonym   Sequence (5'→3')              SEQ ID NO       Position b
16F358/Gamma            [illegible]                   SEQ ID NO:4     [illegible]
16F536/PD               [illegible]                   SEQ ID NO:5     [illegible]
16F926/O                [illegible]                   SEQ ID NO:6     [illegible]
16F1112/3               [illegible]                   SEQ ID NO:7     [illegible]
[illegible]             [illegible]                   SEQ ID NO:8     [illegible]
16R339/Gamma            ACT GCT GCC TCC CGT AGG AG    SEQ ID NO:9     358-339
16R519/PD               GTA TTA CCG CGG CTG CTG       SEQ ID NO:10    536-519
16R1093/3               GTT GCG CTC GTT GCG GGA CT    SEQ ID NO:11    1112-1093

a F, forward primer; R, reverse primer.

b Hybridization position referring to E. coli 16S rDNA gene sequence numbering.


Five forward and three reverse primers were used to obtain a partial overlap of sequences,
ensuring highly reliable assembled sequence data. Sequence assembly was performed
using the program AutoAssembler (Perkin-Elmer, Applied Biosystems Division, Foster
City, CA, USA). Phylogenetic analysis was performed using the software package
GeneCompar™ (v. 2.0, Applied Maths B.V.B.A., Kortrijk, Belgium) after including the
consensus sequences (from strains R-1512, R1534, R114 and R-1506) in an alignment of small
ribosomal subunit sequences collected from the international nucleotide sequence library
EMBL. This alignment was pairwise calculated using an open gap penalty of 100% and a
unit gap penalty of 0%. A similarity matrix was created by homology calculation with a
gap penalty of 0% and after discarding unknown bases. A resulting tree was constructed
using the neighbor-joining method.
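
The similarity calculation described above can be illustrated with a minimal sketch. It is not the GeneCompar™ algorithm; it assumes that a 0% gap penalty means gapped positions are simply left out of the comparison, and that unknown bases are coded as "N". The function name and the example sequences are hypothetical.

```python
def percent_similarity(seq_a, seq_b, gap="-", unknown="N"):
    """Percent identity of two aligned sequences.

    Positions where either sequence has a gap or an unknown base are
    skipped, so they neither count as matches nor as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in (gap, unknown) or b in (gap, unknown):
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 7 comparable positions, 6 of them identical.
print(round(percent_similarity("ACGT-ACGN", "ACGTTACCA"), 1))
```

Applying such a function to every pair of aligned sequences would give a similarity matrix of the kind used as input for tree construction.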

The nucleotide sequence of the 16S rDNA gene from Paracoccus sp. strain R-1512 is
illustrated as SEQ ID NO: 12. The distance matrix, presented as the percentage of 16S
rDNA sequence similarity, between strain R-1512 and its closest relatives, is shown in
Table 5. The sequences from strain R-1512 and its mutant derivatives R1534 and R114
were identical. The sequence from R-1506 differed by only one nucleotide from the
sequence from the latter strains. This demonstrated that strains R-1512 and R-1506 are
phylogenetically highly related and likely belong to the same species (confirmed by DNA:DNA
hybridization, see below). Comparison of the R-1512 and R-1506 sequences with those
publicly available at the EMBL library located R-1512 and R-1506 in the genus Paracoccus.
However, the sequence similarities observed with all currently taxonomically validly
described Paracoccus species were <97%, the value generally accepted as the limit for a
possible relatedness at the species level [Stackebrandt and Goebel, Int. J. Syst. Bacteriol.
44:846-849 (1994)]. This demonstrated that strains R-1512 (and its mutant derivatives)
and R-1506 belong to one or two new Paracoccus species. Sequence similarities of >97%
(significant for a possible relationship at the species level) were observed between four
unnamed Paracoccus strains and strains R-1512, R1534, R114 and R-1506, suggesting that
one or more of the unnamed (MBIC) strains may relate at the species level to strains
R-1512 and R-1506. Based on cluster analysis (phylogenetic tree depicting the
phylogenetic relatedness between Paracoccus sp. strains R-1512, R1534, R114, R-1506,
MBIC3966, and other members of the genus Paracoccus), strains R-1512, R1534, R114,
R-1506 and four unnamed Paracoccus strains (MBIC3024, MBIC3966, MBIC4017 and
MBIC4020) were selected for DNA:DNA hybridization experiments to analyze species
relatedness.
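
The 97% screening step can be sketched as a simple filter over the Table 5 similarity values. The dictionary holds a subset of those values; the function name is hypothetical, and the sketch assumes that a value of exactly 97.0 is counted as meeting the limit.

```python
# Percent 16S rDNA similarity to strain R-1512 (subset of the Table 5 values).
SIMILARITY_TO_R1512 = {
    "MBIC3966": 100.0,
    "MBIC3024": 98.2,
    "MBIC4020": 98.1,
    "MBIC4036": 97.0,
    "MBIC4017": 96.9,
    "P. marcusii DSM 11574": 96.2,
    "P. carotinifaciens E-396": 96.1,
    "P. solventivorans DSM 6637": 95.4,
}


def possible_conspecifics(similarities, threshold=97.0):
    """Names whose similarity reaches the species-level limit (assumed inclusive)."""
    return sorted(name for name, value in similarities.items() if value >= threshold)


print(possible_conspecifics(SIMILARITY_TO_R1512))
```

A 16S similarity at or above the limit only indicates a possible species-level relationship; the DNA:DNA hybridization data are what decide it.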

Table 5. Distance matrix, presented as the percentage of 16S rDNA sequence similarity,
between Paracoccus sp. strain R-1512 and its closest relatives.

Strain a                                EMBL Accession number   % Similarity
R-1512                                                          100
R1534                                                           100
R114                                                            100
R-1506                                                          99.9
Paracoccus sp. MBIC3966                 AB018688                100
Paracoccus sp. MBIC3024                 AB008115                98.2
Paracoccus sp. MBIC4020                 AB025191                98.1
Paracoccus sp. MBIC4036                 AB025192                97.0
Paracoccus sp. MBIC4017                 AB025188                96.9
Paracoccus sp. MBIC4019                 AB025190                96.8
Paracoccus sp. MBIC4018                 AB025189                96.4
Paracoccus marcusii DSM 11574 T         Y12703                  96.2
Paracoccus carotinifaciens E-396 T      AB006899                96.1
Paracoccus solventivorans DSM 6637 T    Y07705                  95.4
Paracoccus thiocyanaticus THI011 T      D32242                  95.3
Paracoccus aminophilus JCM 7686         D32239                  95.1
Paracoccus alcaliphilus JCM 7364        D32238                  95.0
Paracoccus pantotrophicus ATCC 35512    Y16933                  95.0
Paracoccus denitrificans ATCC 17741     Y16927                  94.8
Paracoccus versutus IAM 12814           D32243                  94.7
Paracoccus kocurii JCM 7684             D32241                  94.6
Paracoccus aminovorans JCM 7685         D32240                  94.4
Paracoccus alkenifer A901/1 T           Y13827                  94.3
Rhodobacter capsulatus ATCC 11166       D16428                  92.9

a Type strains are followed by a T.



DNA:DNA hybridization and determination of G+C content. The bacteria set forth in
Table 5 were grown in LMG medium 185. Genomic DNA was prepared according to the
protocol of Wilson [In Ausubel et al. (eds.), Current Protocols in Molecular Biology,
Greene Publishing and Wiley Interscience, New York, 2.4.1-2.4.5 (1987)]. The G+C
content of the DNAs was determined by HPLC according to Mesbah et al. [Int. J. Syst.
Bacteriol. 39:159-167 (1989)] as modified by Logan et al. [Int. J. Syst. Evol. Microbiol.
50:1741-1753 (2000)]. Reported values are the mean of these measurements on the same
DNA sample. DNA:DNA hybridizations were performed using the initial renaturation
rate method as described by De Ley et al. [Eur. J. Biochem. 12:133-142 (1970)]. The
hybridization temperature was 81.5°C. For this method, an average deviation of +/-5.8%
has been reported by Vauterin et al. [Int. J. Syst. Bacteriol. 45:472-489 (1995)]. The G+C
content of the bacterial DNAs and the results of the DNA hybridization experiments are
summarized in Table 6.

Table 6. G+C content (mol %) of DNA from Paracoccus spp. strains and percent DNA
homology between the strains.

                    % DNA homology with strain:
Strain     %G+C     R-1512  R1534  R114  R-1506  MBIC3024  MBIC3966  MBIC4017  MBIC4020
R-1512     67.6     100
R1534      67.7     96      100
R114       67.5     100     97     100
R-1506     67.5     94      90     88    100
MBIC3024   65.4     31      nd a   nd    31      100
MBIC3966   66.9     93      nd     nd    88      32        100
MBIC4017   67.2     32      nd     nd    31      24        24        100
MBIC4020   68.4     27      nd     nd    25      25        23        34        100

a nd, not determined



Strains R-1512, R1534, R114, R-1506 and MBIC3966 showed a DNA homology of >70%
(the generally accepted limit for species delineation [Wayne et al., Int. J. Syst. Bacteriol.
37:463-464 (1987)]), and therefore belong to the same species within the genus Paracoccus.
The G+C content of these five strains varied from 66.9% to 67.7%, thus remaining within
1%, which is characteristic of a well-defined species. On the other hand, the low DNA homology
between strains MBIC3024, MBIC4017 and MBIC4020 and strains R-1512, R1534, R114,
R-1506 and MBIC3966 showed that MBIC3024, MBIC4017 and MBIC4020 each belong to
a different genomic species within the genus Paracoccus.
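
The grouping implied by the >70% criterion can be reproduced from the Table 6 values with a small sketch. The function names are hypothetical, pairs marked "nd" in Table 6 are simply omitted, and linking strains through any pair above the limit (single linkage) is a simplifying assumption rather than the procedure used in the study.

```python
STRAINS = ["R-1512", "R1534", "R114", "R-1506",
           "MBIC3024", "MBIC3966", "MBIC4017", "MBIC4020"]

# Percent DNA homology from Table 6 (pairs reported as "nd" are left out).
HOMOLOGY = {
    ("R1534", "R-1512"): 96, ("R114", "R-1512"): 100, ("R114", "R1534"): 97,
    ("R-1506", "R-1512"): 94, ("R-1506", "R1534"): 90, ("R-1506", "R114"): 88,
    ("MBIC3024", "R-1512"): 31, ("MBIC3024", "R-1506"): 31,
    ("MBIC3966", "R-1512"): 93, ("MBIC3966", "R-1506"): 88,
    ("MBIC3966", "MBIC3024"): 32,
    ("MBIC4017", "R-1512"): 32, ("MBIC4017", "R-1506"): 31,
    ("MBIC4017", "MBIC3024"): 24, ("MBIC4017", "MBIC3966"): 24,
    ("MBIC4020", "R-1512"): 27, ("MBIC4020", "R-1506"): 25,
    ("MBIC4020", "MBIC3024"): 25, ("MBIC4020", "MBIC3966"): 23,
    ("MBIC4020", "MBIC4017"): 34,
}

G_PLUS_C = {"R-1512": 67.6, "R1534": 67.7, "R114": 67.5, "R-1506": 67.5,
            "MBIC3024": 65.4, "MBIC3966": 66.9, "MBIC4017": 67.2, "MBIC4020": 68.4}


def genomic_species(strains, homology, threshold=70):
    """Group strains linked by any pair with homology above the threshold."""
    parent = {s: s for s in strains}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for (a, b), value in homology.items():
        if value > threshold:
            parent[find(a)] = find(b)
    groups = {}
    for s in strains:
        groups.setdefault(find(s), set()).add(s)
    # Largest group first, ties broken alphabetically.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


def gc_spread(group, gc=G_PLUS_C):
    """Difference between the highest and lowest %G+C within a group."""
    values = [gc[s] for s in group]
    return max(values) - min(values)


groups = genomic_species(STRAINS, HOMOLOGY)
print(groups[0], round(gc_spread(groups[0]), 1))
```

With these inputs the five strains of the new species form one group and each of the three remaining MBIC strains stands alone.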

DNA fingerprinting using AFLP™. AFLP™ is a PCR-based technique for whole genome
DNA fingerprinting via the selective amplification and selective visualization of restriction
fragments [Vos et al., Nucleic Acids Research 23:4407-4414 (1995); Janssen et al., supra].
In this analysis, Paracoccus sp. strains R-1512, R1534, R114, R-1506, MBIC3966, and
Paracoccus marcusii DSM 11574 T were compared to evaluate infraspecies relatedness. These
bacteria were grown in LMG medium 185. Genomic DNA from each of these bacteria was
prepared according to the protocol of Wilson (supra). Purified DNA was digested by two
restriction enzymes, a 4-base cutter and a 6-base cutter. In this way, a limited number of
fragments with two different ends and of suitable size for efficient PCR were obtained.
Adaptors (small double-stranded DNA molecules of 15-20 bp) containing one compatible
end were ligated to the appropriate "sticky" end of the restriction fragments. Both adaptors
are restriction halfsite-specific, and have different sequences. These adaptors serve as
binding sites for PCR primers. Here, the restriction enzymes used were ApaI (a
hexacutter, recognition sequence GGGCC/C) and TaqI (a tetracutter, recognition
sequence T/CGA). The sequences of the adaptors ligated to the sticky ends generated by
cleavage with the restriction enzymes are shown in Table 7 (SEQ ID NOs: 13-22). PCR was
used for selective amplification of the restriction fragments. The PCR primers specifically
annealed with the adaptor ends of the restriction fragments. Because the primers contain,
at their 3' end, one so-called "selective base" that extends beyond the restriction site into
the fragment, only those restriction fragments that have the appropriate complementary
sequence adjacent to the restriction site were amplified. The sequences of the six PCR
primer combinations used are also shown in Table 7.
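
The effect of the selective base can be illustrated with a deliberately simplified model. The primer sequences and combinations are those of Table 7; the fragment list is hypothetical, and each fragment is reduced to the single base found next to the ApaI end and next to the TaqI end, read on the strand each primer extends along. Strand orientation and fragment length are ignored.

```python
# Common part of primers A01-A03 and T01-T03 (Table 7); the last base is selective.
APA_CORE = "GACTGCGTACAGGCCC"
TAQ_CORE = "CGATGAGTCCTGACCGA"

PRIMERS = {
    "A01": APA_CORE + "A", "A02": APA_CORE + "C", "A03": APA_CORE + "G",
    "T01": TAQ_CORE + "A", "T02": TAQ_CORE + "C", "T03": TAQ_CORE + "G",
}

COMBINATIONS = {
    "PC A": ("A01", "T01"), "PC B": ("A01", "T02"), "PC D": ("A02", "T01"),
    "PC I": ("A03", "T03"), "PC G": ("A03", "T01"), "PC H": ("A03", "T02"),
}


def selective_base(primer_name):
    """The 3'-terminal base that extends past the restriction site."""
    return PRIMERS[primer_name][-1]


def amplified(fragments, combination):
    """IDs of fragments whose adjacent bases match both selective bases."""
    apa_primer, taq_primer = COMBINATIONS[combination]
    wanted = (selective_base(apa_primer), selective_base(taq_primer))
    return [fid for fid, apa_base, taq_base in fragments
            if (apa_base, taq_base) == wanted]


# Hypothetical fragments: (id, base next to ApaI end, base next to TaqI end).
FRAGMENTS = [("f1", "A", "A"), ("f2", "A", "C"), ("f3", "G", "A"),
             ("f4", "C", "A"), ("f5", "G", "G")]

for name in COMBINATIONS:
    print(name, amplified(FRAGMENTS, name))
```

Each primer combination therefore samples a different subset of the restriction fragments, which is why six combinations yield six independent fingerprints.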

Table 7. Adaptors and PCR primers used for AFLP™ analysis.

                 Sequence                        SEQ ID NO
Adaptors corresponding to restriction enzyme ApaI
Adaptor 93A03    5'-TCGTAGACTGCGTACAGGCC-3'      SEQ ID NO:13
Adaptor 93A04    3'-CATCTGACGCATGT-5'            SEQ ID NO:14
Adaptors corresponding to restriction enzyme TaqI
Adaptor 94A01    5'-GACGATGAGTCCTGAC-3'          SEQ ID NO:15
Adaptor 94A02    3'-TACTCAGGACTGGC-5'            SEQ ID NO:16

                 Sequence                        SEQ ID NO
Primer combination 1 (PC A)
A01              5'-GACTGCGTACAGGCCCA-3'         SEQ ID NO:17
T01              5'-CGATGAGTCCTGACCGAA-3'        SEQ ID NO:18
Primer combination 2 (PC B)
A01              5'-GACTGCGTACAGGCCCA-3'         SEQ ID NO:17
T02              5'-CGATGAGTCCTGACCGAC-3'        SEQ ID NO:19
Primer combination 3 (PC D)
A02              5'-GACTGCGTACAGGCCCC-3'         SEQ ID NO:20
T01              5'-CGATGAGTCCTGACCGAA-3'        SEQ ID NO:18
Primer combination 4 (PC I)
A03              5'-GACTGCGTACAGGCCCG-3'         SEQ ID NO:21
T03              5'-CGATGAGTCCTGACCGAG-3'        SEQ ID NO:22
Primer combination 5 (PC G)
A03              5'-GACTGCGTACAGGCCCG-3'         SEQ ID NO:21
T01              5'-CGATGAGTCCTGACCGAA-3'        SEQ ID NO:18
Primer combination 6 (PC H)
A03              5'-GACTGCGTACAGGCCCG-3'         SEQ ID NO:21
T02              5'-CGATGAGTCCTGACCGAC-3'        SEQ ID NO:19


Following amplification, the PCR products were separated according to their length on a
high resolution polyacrylamide gel using a DNA sequencer (ABI 377). Fragments that
contained an adaptor specific for the restriction halfsite created by the 6-bp cutter were
visualized by autoradiography due to the 5'-end labeling of the corresponding primer with
32P. The electrophoretic patterns were scanned and numerically analyzed with
GelCompar™ 4.2 software (Applied Maths, B.V.B.A., Kortrijk, Belgium) and clustered using
the Pearson curve matching coefficient and unweighted pair group averages linking
[clustering methods were reviewed by Sneath and Sokal, In: Numerical Taxonomy.
Freeman & Son, San Francisco (1973)].
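
The two numerical steps named above, Pearson curve matching and unweighted pair group average linking, can be sketched in plain Python. This is not the GelCompar™ implementation; the lane names and densitometric traces are hypothetical, and each trace is assumed to have been resampled to the same number of points.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long densitometric traces."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("traces must have the same length (at least 2 points)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a flat trace has no defined correlation")
    return sxy / sqrt(sxx * syy)


def upgma(names, similarity):
    """Unweighted pair group average linking on a pairwise similarity table.

    similarity maps frozenset({a, b}) to a similarity value. Returns the
    merges as (cluster_1, cluster_2, level) in the order they occur.
    """
    clusters = [(name,) for name in names]

    def average(c1, c2):
        total = sum(similarity[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        level, i, j = max(
            (average(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], level))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical lane traces (band intensity along the gel).
TRACES = {
    "lane_a": [0, 5, 1, 0, 4, 0],
    "lane_b": [0, 4, 1, 0, 5, 0],
    "lane_c": [3, 0, 0, 4, 0, 2],
}
NAMES = list(TRACES)
SIMILARITY = {
    frozenset((a, b)): pearson(TRACES[a], TRACES[b])
    for i, a in enumerate(NAMES) for b in NAMES[i + 1:]
}
for first, second, level in upgma(NAMES, SIMILARITY):
    print(first, second, round(level, 2))
```

The level of each merge corresponds to a branching point in the resulting dendrogram, which is the quantity averaged over the six primer combinations in the text that follows.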

In all six primer combinations (PC A-H, Table 7), the DNA fingerprints of Paracoccus sp.
strains R-1512, R1534 and R114 were highly similar if not identical. In cases where minor
differences were observed, reproducibility was not evaluated. The high similarity or
identity among the three strains was expected, as strains R1534 and R114 were derived
from strain R-1512. With all primer combinations, strains R-1512, R1534 and R114 were
clearly discriminated from strains R-1506 and MBIC3966, which also
belong to the new Paracoccus species. However, the fingerprints provide no clear
indication that strains R-1512, R1534 and R114 are more related to either R-1506 or
MBIC3966. Under the conditions used, the five strains of the new species cluster at an
average level of about 58% similarity (this value is the mean of the six values of the
branching points of the new species in the six AFLP™ experiments (six primer
combinations)), and the cluster can clearly be discriminated from the profile of Paracoccus
marcusii DSM 11574 T, the type strain of a phylogenetically related carotenoid-producing
Paracoccus species. The mean similarity value of the six branching points for Paracoccus
marcusii DSM 11574 T and the new species was about 11%.

Fatty acid analysis. The fatty acid compositions of the cell membranes of Paracoccus sp.
strains R-1512, R1534, R114, R-1506 and MBIC3966 were compared to those of the type strains P.
marcusii DSM 11574 T, P. carotinifaciens E-396 T and P. solventivorans DSM 6637 T. The
bacteria were grown for 24 hours at 28°C in LMG medium 185. The fatty acid compositions
were determined by gas chromatography using the commercial system MIDI (Microbial
Identification System, Inc., DE, USA). Extraction and analysis of fatty acids were
performed according to the recommendations of the MIDI system. Table 8 summarizes the
results for all strains tested. For the five strains of the new Paracoccus species (R-1512,
R1534, R114, R-1506, MBIC3966), the mean profile was calculated. All eight organisms
showed a comparable fatty acid composition of their cell membranes, with 18:1 w7c as the
major compound. Only minor differences in fatty acid composition were observed
between the new Paracoccus species and the three type strains.

Utilization of carbon sources for growth. For testing the aerobic utilization of carbon
sources, BIOLOG SF-N MicroPlate microtiter plates (Biolog Inc., Hayward, CA, USA)
containing 95 substrates were used, with the exception that the substrate in well E6 was
D,L-lactic acid methyl ester instead of the usual sodium salt of D,L-lactic acid. Cells from
each of the strains identified in Table 9 were grown for 24 hours at 28°C in LMG medium
12 (Marine Agar, Difco 0979). A cell suspension with a density equivalent to 0.5
McFarland units was prepared in sterile distilled water. From this suspension, 18 drops
were transferred into 21 ml of AUX medium (API 20NE, bioMérieux, France) and mixed
gently. A 0.1 ml aliquot of the suspension was transferred to each well of the BIOLOG
MicroPlates, and the plates were incubated at 30°C. Wells were visually checked for
growth after 48 hours and after 6 days. Also, at 6 days the visual scoring was confirmed by
reading the microtiter plates using the BIOLOG plate reader.

The results of the BIOLOG analysis are shown in Table 9. Growth (positive reaction) was
determined as increased turbidity compared to the reference well without substrate. A
distinction was made between good growth (+), weak growth (±) and no growth (-).
Results in parentheses are those obtained after 6 days if different from the results obtained
after 48 hours. A question mark indicates an unclear result at 6 days. Of the 95 carbon
sources tested, 12 could be used, and 47 could not be used, for growth by all five strains
comprising the new Paracoccus species (R-1512, R1534, R114, R-1506 and MBIC3966).
These five strains gave variable growth responses to the remaining 36 substrates. The new
Paracoccus species could be distinguished from the two other carotenoid-producing
bacteria (P. marcusii DSM 11574 T and P. carotinifaciens E-396 T) by their inability to use
seven carbon sources (adonitol, i-erythritol, gentiobiose, β-methylglucoside, D-sorbitol,
xylitol and quinic acid). Two carbon sources that were utilized by all five members of the
new Paracoccus species (L-asparagine and L-aspartic acid) were not used for growth by P.
marcusii DSM 11574 T.
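
The tally of substrates used by all, by none, or variably by the five strains can be sketched as follows. The three rows are taken from the legible part of Table 9 (columns R1512, R1534, R114, R1506, MBIC3966); the function names are hypothetical, and counting only "+" as growth and only "-" as no growth, with everything else treated as variable, is an assumption about how the scores were aggregated.

```python
import re


def final_score(cell):
    """Score after 6 days: the value in parentheses if present, else the 48 h value."""
    match = re.search(r"\((.+?)\)", cell)
    return match.group(1) if match else cell.strip()


def classify(cells):
    """Classify one substrate across the strains of the new species."""
    finals = {final_score(cell) for cell in cells}
    if finals == {"+"}:
        return "used by all"
    if finals == {"-"}:
        return "used by none"
    return "variable"


# Scores for R1512, R1534, R114, R1506 and MBIC3966 (from Table 9).
ROWS = {
    "Tween 80": ["-", "-", "-", "-", "-"],
    "D-Mannitol": ["+", "+", "+", "+", "-(+)"],
    "Sucrose": ["+", "+", "±(+)", "-(+)", "-"],
}

for substrate, cells in ROWS.items():
    print(substrate, classify(cells))
```

Counting the three classes over all 95 substrates in this way would give a tally of the same form as the 12, 47 and 36 reported above.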



Table 8. Fatty acid composition of cell membranes of Paracoccus sp. strains R-1512,
R1534, R114, R-1506, MBIC3966 and three type strains of other Paracoccus species, i.e. P.
marcusii DSM 11574 T, P. carotinifaciens E-396 T and P. solventivorans DSM 6637 T.

                    Mean % for:              % for:
Name                R-1512, R1534, R114,     DSM 11574 T   E-396 T   DSM 6637 T
                    R-1506 and MBIC3966
10:0 3OH            4.9 ± 1.1                6.2           3.4       3.6
Unnamed 11.799      [illegible]              4.9           2.8       3.0
Unnamed 15.275      [illegible]              2.9           1.1       ND a
16:0                [illegible]              ND            0.3       0.7
17:1 w8c            ND                       ND            0.6       0.8
17:0                0.1 ± 0.1                ND            0.3       1.3
18:1 w7c            80.5 ± 1.8               80.3          84.0      79.0
[illegible]         [illegible]              2.6           5.2       6.6
[illegible]         [illegible]              ND            ND        ND
[illegible]         ND                       ND            ND        0.7
20:1 w7c            0.8 ± 0.2                ND            0.2       2.0
Summed feature 2    2.7 ± 0.4                3.0           2.1       2.6
Summed feature 3    0.7 ± 0.5                ND            0.2       ND
TOTAL               99.3                     99.9          100.2     100.3

a ND, not detected

Biochemical tests. Selected biochemical features were tested using the API 20NE strip
(bioMérieux, France). Cells from each of the bacterial strains identified in Table 10 were
grown for 24 hours at 28°C on LMG medium 12. Cell suspensions were prepared and
strips inoculated according to the instructions of the manufacturer. Strips were incubated
at 28°C and results determined after 24 and 48 hours. The results are summarized in Table
10. Of the nine features tested, only one (urease activity) gave a variable response among
the five strains of the new Paracoccus species. These nine tests did not differentiate
between the new Paracoccus species and Paracoccus marcusii DSM 11574 T and P.
carotinifaciens E-396 T.



Table 9. Utilization of carbon sources for growth by Paracoccus spp. strains.

Columns, in order: R1512, R1534, R114, R1506, MBIC3966, DSM 11574 T, E-396 T, DSM 6637 T.

α-Cyclodextrin


















Dextrin
















-(±) | 


Glycogen 


















Tween 40 


- 


- 


- 


- 


- 


j - 


1 -(?) 




Tween 80 


- 


- 


- 


- 


- 


1 - 


j - 


- 


GalNAc 


- 


- 


- 


- 


- 






- 






GlucNAc 
















-(?) 1 


Adonitol












i 

I 


~r 




L-Arabinose












i 

T 




i I 
+ I 


D-Arabitol


i 

T 


i 

T 


i 

T 


i 

T 


±(+) 


l i 
i "*~ 


■ 




Cellobiose 


±(+) 


±(+) 


-(?) 


-(+) 


-(±) 


+ 


+ 


-(+) 1 


i-Erythritol 














+ 




D-Fructose


i 


i 

T 


■ 

T 


_i 
T 




+ 


i 


1 v 

+ 


L-Fucose 


















D-Galactose 


+ 


+ 


+ 


±(+) 


±(+) 


+ 


+ 


-(±) 1 


Gentiobiose 












+ 


+ 


-(±) 


α-D-Glucose


+ 


+ 


+ 


±(+) 


-(+) 


+ 


±( + ) 


+ 


m-Inositol 


+ 


+ 


+ 


-(+) 


-(+) 


+ 


-(±) 




α-Lactose


+ 


±( + ) 


-(+) 


-(+) 


-(+) 


+ 




±(+) 1 


Lactulose 


-(±) 


"( + ) 


-(+) 


-(±) 




+ 




-(+) 


Maltose 


+ 


+ 


-(+) 


-(+) 




+ 


+ 


-(+) 

1 


D-Mannitol 


+ 


+ 


+ 


+ 


-(+) 


+ 


+ 


-(+) 


D-Mannose 


+ 


+ 


+ 


+ 


-(±) 


+ 


+ 


-(+) 


D-Melibiose 




+ 


+ 


-(+) 


-(+) 


+ 


+ 


-(?) 


β-Methylgluc












+ 


+ 


+ 
D-Psicose


-(+) 


±(+) 


±(+) 




-(+) 




± 




D-Raffinose 


- 


- 


- 


- 


— 


-(+) 


+ 


— 


L-Rhamnose 
















-(?) 


D-Sorbitol 


_ 


- 


- 


— 


- 


+ 


+ 




Sucrose 


+ 


+ 


±(+) 


-(+) 


- 


+ 


+ 


+ 


D-Trehalose 


+ 


+ 


-(+) 


-(+) 


-(+) 




+ 


+ 






Turanose 


-(+) 


-(+) 


■ 






+ 


+ 


+ 


V 1 'a. 1 

Xylitol 












a 

+ 




( 


Methylpyruvate 


± 


- 


± 


-(?) 


± 




+ 


-(±) 


MMSucc 


±(+) 


± 


-(±) 


-(+) 


-(±) 


-(+) 




- 


Acetic acid 




- 


± 


- 


- 


- 


- 


+ 


Cis-aconitic 
acid 


~ 


± 


± 


- 


- 


± 


- 


- 


Citric acid 




± 


■ 

± 







± 






Formic acid 


- 












- 


- 


GalAlactone 


-(±) 


-(±) 


-(+) 


- 


- 


- 


-(±) 


-(?) 


GalacturonicA 


- 


- 


- 


- 


- 


-(+) 


-(±) 


- C 


D-Gluconic 
acid 


+ 


4- 


+ 


-<±) 


-(±) 


+ 


+ 


+ 


GlucosaminicA


















GlucuronicA


X 


T 


T 


-(±) 




±(+) 






AHBA 


-(±) 




-(±) 




-(+) 








BHBA 


+ 


+ 


+ 


-(±) 


± 


"( + ) 


+ 


+ 


GHBA 


















PHPAA 
















-(+) 


Itaconic acid 


AKBA
















1 / i \ 


AKGA 








-(±) 


"(?) 


-(±) 


-(+) 


1 " (±) 1 


AKVA 


















LAME 


- 


- 


- 


f - 


| - 


- 


- 


r — 


Malonic acid 


- 


- 


- 


1 - 


- 


- 


- 


- 


Propionic acid 


- 


± 


± 


- 




± 


+ 


+ 




















+ 




SaccA 


-(+) 


± 




-(+) 








\- 


Sebacic acid


w J 




v w 


! 


"(±) 




V w 





Succinic acid














i 

x 




BromosuccA 












± 






Succinamic acid 












-(+) 


-(+) 




Glucuronamide 










-(±) 








Alaninamide 


_ 


— 




_ 




-(+) 


+ 


— j 


D-Alanine 


- 


- 


-(+) 


- 


- 


- 


- 


- 


L-Alanine 


+ 


+ 


+ 


+ 


1 


-(+) 


+ 


+ 


L-Alanyl-glycine


-(+) 




-(+) 








-(+) 


-(?) 


L-Asparagine 


+ 




±(+) 1 


+ 


±(+) 1 




+ 




L-Aspartic acid 


+ 


+ 


±(+) 1 


-(+) 


-(+) 




+ 




L-Glutamic acid 




+ 


+ 


+ 1 




-(+) 


+ 


-(+) 


GAA








-(±) 










GGA








1 






( 1 \ 

-(±) 




L-Histidine 












-(?) 




+ 


HydPro 
















+ 


L-Leucine 


-(±) 


-(+) 


-(+) 


-(+) 




-(+) 


-(?) 


-(+) 
L-Ornithine




-(+) 


±(+) 

> * 


-(±) 






-(+) 




L-Phenylalanine


















L-Proline 




+ 




+ 




-(+) 


+ 


+ 


PyroGluA 


+ 


+ 


+ 


+ 


±(+) 


-(+) 


+ 


- 


D-Serine 


















L-Serine 


± 


±(+) 


-(+) 


-(±) 


-(+) 


-(+) 


-(+) 


+ 






L-Threonine 












-(+) 






D,L-Carnitine
















* 

r 

( 


GABA 










-(+) 




-(+) 


-(+) 


Urocanic acid 


— 


- 


- 


— 




- 




-(+) 


Inosine 


- 


-(±) 


- 


- 


- 


-(±) 


-(+) 


-(+) 


Uridine 












-(+) 


-(+) 




Thymidine 


- 


- 


- 


- 


- 


-(±) 


-(±) 


- 


PEA 


- 


— 


— 


— 


— 


— 


— 


- 


Putrescine


- 


- 


— 






— 




— 


2-Aminoethanol
















I 

H. . 


2,3-Butanediol 


















Glycerol 


+ 


+ 


+ 


-(+) 




+ 


+ 




GlycP 










-(±) 








Gluc-1-P














«(±) 




Gluc-6-P 



















GalNAc: N-Acetyl-D-galactosamine; GlucNAc: N-Acetyl-D-glucosamine; β-Methylgluc:
β-Methylglucoside; MMSucc: Mono-methylsuccinate; GalAlactone: D-Galactonic acid
lactone; GalacturonicA: D-Galacturonic acid; GlucosaminicA: D-Glucosaminic acid;
GlucuronicA: D-Glucuronic acid; AHBA: α-Hydroxybutyric acid; BHBA: β-Hydroxybutyric
acid; GHBA: γ-Hydroxybutyric acid; PHPAA: p-Hydroxyphenylacetic acid; AKBA:
α-Ketobutyric acid; AKGA: α-Ketoglutaric acid; AKVA: α-Ketovaleric acid; LAME:
D,L-Lactic acid methyl ester; SaccA: D-Saccharic acid; BromosuccA: Bromosuccinic acid;
GAA: Glycyl-L-aspartic acid; GGA: Glycyl-L-glutamic acid; HydPro: Hydroxy-L-proline;
PyroGluA: L-Pyroglutamic acid; GABA: γ-Aminobutyric acid; PEA: Phenylethylamine;
GlycP: D,L-α-Glycerolphosphate; Gluc-1-P: Glucose-1-phosphate; Gluc-6-P:
Glucose-6-phosphate

Table 10. Biochemical features of Paracoccus spp. strains: 12 = R1512; 34 = R1534; 14 =
R114; 06 = R1506; 66 = MBIC3966; 74 = DSM 11574 T; 96 = E-396 T; 37 = DSM 6637 T.

Feature                         12      34      14      06      66      74      96      37
Reduction nitrate to nitrite                                                            +
Reduction nitrate to nitrogen                                                           +
Indole from tryptophan
Fermentation of glucose
Arginine hydrolase
Urease                          S/+5                    S/+5    +
Esculin hydrolysis a            weak    S/+5    S/+5    +       +       +       +       +
Gelatin hydrolysis b
β-Galactosidase                 +       +       +       +       +       +       +

a: β-glucosidase; b: protease; S/+5: Slow + 5 days



Physiological tests. Several physiological and morphological tests were performed on the
five strains of the new Paracoccus species, along with Paracoccus marcusii DSM 11574 T,
Paracoccus carotinifaciens E-396 T and Paracoccus solventivorans DSM 6637 T. The methods
used for each test were as follows.

Temperature range for growth. Cells were grown for 24 hours at 28°C on LMG medium 12.
A cell suspension with a density of between 1-2 McFarland units was prepared in sterile
distilled water. From this suspension, 3 drops were transferred onto the agar surface of
LMG medium 12. One drop was diluted by streaking; the other 2 drops were left
undisturbed. The plates were incubated under aerobic conditions at 10°C, 25°C, 30°C, 33°C,
37°C and 40°C, and checked for growth after 24 hours, 48 hours and 5 days. Growth was
determined as visual growth (confluent in the drops and as colonies in the streaks with
diluted inoculum) compared to the growth at 30°C (i.e., the "control"). Scoring was done
(vs. the control plate) as follows: better growth (++), good (equivalent to the control)
growth (+), weaker growth (±), poor growth (±), and no growth (-). Results in
parentheses are those observed in the streaks if different from the confluent growth in the
undisturbed drops.

Salt tolerance. Cells were grown for 24 hours at 28°C on LMG medium 12. A cell suspension
with a density of 1-2 McFarland units was prepared in sterile distilled water. From this
suspension, 3 drops were transferred onto the agar surface of LMG medium 12 supplemented
with NaCl to reach final concentrations of 3%, 6% and 8%. One drop was diluted by
streaking; the other 2 drops were left undisturbed. The plates were incubated under
aerobic conditions at 28°C and checked for growth after 24 hours, 48 hours and 5 days.
Growth was determined as visual growth (confluent in the drops and as colonies in the
streaks with diluted inoculum) compared to the growth without added NaCl (control).
Scoring was done (vs. the control plate) as follows: better growth (++), good (equivalent
to the control) growth (+), weaker growth (±), poor growth (±), and no growth (-).
Results in parentheses are those observed in the streaks if different from the confluent
growth in the undisturbed drops.

pH range for growth. Cells were grown for 24 hours at 28°C in LMG medium 12. A cell
suspension with a density of 1-2 McFarland units was prepared in sterile distilled water.
From this suspension, 3 drops were transferred into tubes containing 10 ml liquid LMG
medium 12 with modified pH, giving final pH values after autoclaving of pH 6.1, pH 6.3,
pH 7.0, pH 7.7, pH 8.1 and pH 9.1. The liquid cultures were incubated aerobically (with
shaking) at 28°C. Growth was checked at 24 hours, 48 hours, 3 days and 6 days. Growth was
determined as increased turbidity (measured as % transmission using the BIOLOG
turbidimeter) compared to growth at pH 7.0 (control). Scoring was done (vs. the control)
as follows: better growth (++), good (equivalent to the control) growth (+), weaker
growth (±), poor growth (±), and no growth (-).

Starch hydrolysis. Cells were grown for 24 hours at 28°C on LMG medium 12 plates. A
loopful of cells was taken from the plate and transferred as one streak onto the agar
surface of LMG medium 12 supplemented with 0.2% soluble starch. Plates were then
incubated under aerobic conditions at 28°C. When the strains had developed good growth
(after 48 hours), the plate was flooded with Lugol solution (0.5% I2 and 1% KI in distilled
water). Hydrolysis was determined as a clear zone alongside the growth (in contrast to the
blue color of the agar where starch was not hydrolyzed).

Denitrification. Cells were grown for 24 hours at 28°C on LMG medium 12 plates. A
loopful of cells was taken from the plate and stabbed into tubes containing semi-solid
(0.1% agar) LMG medium 12 supplemented with 1% KNO3. The tubes were incubated at
28°C for 5 days. Denitrification (N2 from nitrate) was determined as gas formation
alongside the stab.

Growth under anaerobiosis without electron acceptor added. Cells were grown for 24 hours
at 28°C on LMG medium 12 plates. A loopful of cells was taken from the plate and
streaked onto the agar surface of LMG medium 12. The agar plates were incubated under
anaerobic conditions (ca. 10% CO2 + ca. 90% N2) at 30°C. Plates were checked for growth
after 24 hours and after 5 days. Growth was determined visually and compared to the
aerobic (control) condition. Scoring was done (vs. the control) as follows: better growth
(++), good (equivalent to the control) growth (+), weaker growth (±), poor growth (±),
and no growth (-).

Growth under anaerobic conditions with glucose added (fermentation). Cells were grown for
24 hours at 28°C on LMG medium 12 plates. A loopful of cells was taken from the plate
and stabbed into tubes containing the basal agar medium of Hugh and Leifson [J.
Bacteriol. 66:24-26 (1953)]. Paraffin oil was added to the top of the medium, and the
tubes were incubated at 30°C. Tubes were checked for growth and acid formation after 48
hours and after 5 days. Growth was determined visually. Scoring was done as follows:
good growth (+), poor growth (±), and no growth (-).

Growth under anaerobic conditions with KNO3 as electron acceptor. Cells were grown for 24
hours at 28°C on LMG medium 12 plates. A loopful of cells was taken from the plate and
streaked onto the agar surface of LMG medium 12 supplemented with 0.1% KNO3. The
plates were incubated under anaerobic conditions (ca. 10% CO2 + ca. 90% N2) at 30°C,
and checked for growth after 3 days. Growth was determined as visual growth compared
to the aerobic (control) condition. Scoring was done (vs. the control) as follows: better
growth (++), good (equivalent to the control) growth (+), weaker growth (±), poor
growth (±), and no growth (-).

Catalase and oxidase reactions. Cells were grown for 24 hours at 28°C on LMG medium 12
plates. A positive result for catalase activity was the production of gas bubbles after
suspending a colony in one drop of 10% H2O2. A positive result for oxidase activity was
the development of a purple-red color after rubbing a colony on filter paper soaked with
1% tetramethyl-p-phenylenediamine.

Colony pigmentation. Cells were grown for 5 days at 28°C on LMG medium 12. Color of
colonies was observed visually.

Cell morphology and motility. Cells were grown for 24 hours at 28°C on LMG medium 12.
Cell suspensions were made in sterile saline. Cell morphology and motility were observed
microscopically using an Olympus light microscope equipped with phase contrast optics
(magnification 1000x).

The results of the physiological and morphological tests are summarized in Table 11. The
five strains of the new Paracoccus species responded essentially identically in all
physiological and morphological tests performed. The tests that gave identical responses
for all five strains of the new Paracoccus species and that allowed discrimination of these
organisms from Paracoccus marcusii DSM 11574 and/or Paracoccus carotinifaciens E-396
were: growth at 40°C, growth with 8% NaCl, growth at pH 9.1, and colony pigmentation.

Zeaxanthin production in strains R-1512, R1534, R114 and R-1506. Strains R-1512, R1534,
R114 and R-1506 were grown in ME medium, which contains (per liter distilled water): 5 g
glucose, 10 g yeast extract, 10 g tryptone, 30 g NaCl and 5 g MgSO4·7H2O. The pH of the
medium was adjusted to 7.2 with 5N NaOH before sterilizing by autoclaving. All cultures
(25-ml volume in 250-ml baffled Erlenmeyer flasks with plastic caps) were grown at 28°C
with shaking at 200 rpm. Seed cultures were inoculated from frozen glycerolized stocks
and grown overnight. Aliquots were transferred to the experimental flasks to achieve an
initial optical density at 660 nm (OD660) of 0.16. Cultures were then grown at 28°C with
shaking at 200 rpm. Growth was monitored throughout the cultivation and at 6, 10 (or 15
for strain R114), and 24 hours, an aliquot of the culture was removed for analysis of
carotenoids by the method described in Example 1.

The doubling times of strains R-1512, R1534 and R-1506 under these conditions were 0.85
hours, 1.15 hours and 1.05 hours, respectively. Strain R114 reproducibly exhibited a
biphasic growth profile; the doubling time of strain R114 in the initial phase was 1.4 hours
while the doubling time in the second phase was 3.2 hours.
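
How a doubling time like those quoted above follows from optical density readings can be
sketched as below. This is only an illustration, not the procedure used in the experiments:
it assumes strictly exponential growth between two sampling points, the function names are
hypothetical, and the readings in the usage line are invented for the example (only the
starting OD660 of 0.16 comes from the text).

```python
import math


def doubling_time(od_start: float, od_end: float, hours: float) -> float:
    """Doubling time in hours, assuming exponential growth between two OD660 readings."""
    if od_start <= 0 or od_end <= od_start or hours <= 0:
        raise ValueError("need od_end > od_start > 0 and hours > 0")
    growth_rate = math.log(od_end / od_start) / hours  # specific growth rate (1/h)
    return math.log(2) / growth_rate


def projected_od(od_start: float, hours: float, t_double: float) -> float:
    """OD660 expected after `hours` of unrestricted exponential growth."""
    return od_start * 2 ** (hours / t_double)


if __name__ == "__main__":
    # Hypothetical readings: two doublings (0.16 -> 0.64) within 1.7 h.
    print(round(doubling_time(0.16, 0.64, 1.7), 2))
```

For a biphasic profile such as that of strain R114, the same calculation would be applied
separately to sampling points within each phase.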

Table 12 shows the zeaxanthin production and Specific Formation (zeaxanthin production
normalized to OD660) by the Paracoccus sp. strains in ME medium. The data are averages
of four independent experiments, and within each experiment each strain was tested in
duplicate flasks. The improved zeaxanthin production in the classically-derived mutant
strains R1534 and R114 compared to the parental strain R-1512 is clearly shown.
Zeaxanthin production by strain R-1506 was approximately the same as that of strain
R-1512. No other carotenoids were detected in any of the cultures.

Table 11. Physiological characteristics of Paracoccus spp. strains: 12 = R1512; 34 = R1534;
14 = R114; 06 = R1506; 66 = MBIC3966; 74 = DSM 11574T; 96 = E-396T; 37 = DSM 6637T



Time [days] 


12 


34 


14 


06 


66 


74 


96 


37 




Growth at 10°C 






1 


















5 


±(±) 


±(±) 


±(-)


±(±) 


±(-) 


±(±)


±(±) 


±(±) 




Growth at 25°C 


1 


+ 




+ 


+ 


+ (±) 


+ (±) 


+ (±) 


+ (-) 


5 


+ 




+ 


+ 


+ 


+ 


+ 


+ 




Growth at 30°C 


1 


+ 


+ 


+ 




+ 


+ 




+ 


5 


+ 


+ 


+ 


+ 




+ 




+ 




Growth at 33°C 


1 


+ 


+ 


+ 


+ 




+ 


+ 


+ 


5


+ 


+ 


+ 


+ 




+ 


+ 


+ 




Growth at 37°C 


1 
1 


+ 


+ 


+ (±) 


+ 




±(-) 


±(-) 


+ 


5


+ 


+ 


+ 


+ 


+ 


±(-) 


±(±) 


+ 




Growth at 40°C 




1 


+ 


+ (±) 


+ (-) 


+ (±)


±(-) 






+ (*) 


5 


+ 


+ (±) 


+ (-) 


+ 


+ (-) 






+ (*) 




Growth with 3% NaCl 




1 






+ 


+ 


+ 


+ 


+ 


± 


5 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




Growth with 6% NaCl 




1 


+ (±) 


±(±) 


±(±) 


+ 


±(±) 


±(") 


±(-) 




5 


+ 


+ 


+ 


+ 


+ (*) 


+ (±) 


+ (±)



Time [days] 


12 


34 


14 


06 


66 


74 


96 


37




Growth with 8% NaCl 


1 


+ (±) 


±(±) 


±(-) 


+(±) 


±(±) 








5 


+ 


+ 




+ 


+ (*) 


±(-) 


±(-) 






Growth at pH 6. 1 


1 


+ 


+ 


+ 


+ 










6 


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




Growth at pH 6.3 


1 


+ 


+ 




+ 




± 


+ 


± 


6


+ 


+ 


+ 




+


+ 




+ 




Growth at pH 7.0 


1


+ 






+ 


+ 


+ 


+ 





6


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




Growth at pH 7.7 


1 


+ 


+ 




+ 


+ 


± 


± 


± 


6


+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




Growth at pH 8. 1 


1 


+ 


+ 


+ 


+ 


+ 






± 


6 


+ 


+ 


+ 




+ 




+ 


+ 




Growth at pH 9. 1 


1


± 


+ 














6


+ 


+ 




+ 






+ 


+ 





Starch hydrolysis 1 























Denitrification 1 


















+ 




Growth in anaerobiosis without electron acceptor added 
























WO 02/099095 



PCT/EP02/06171 



12    34    14    06    66    74    96    37

Growth in anaerobiosis with glucose added (fermentation)



















± 




Growth in anaerobiosis with KNO3 added 




















- 




Catalase reaction 








+ 


+ 




+ 


+ 


+ 


+ 


+ 




Oxidase reaction 






+ 


+ 


+ 


+ 


+ 


+ 


+ 


+ 




Gram stain 




















- 




Motility 
























Colony pigmentation 






Y-O 


Y-O 


Y-O 


Y-O 


Y-O 


O P 


O-P 


P Y 




Cell morphology 






S toC 


S toC 


StoC 


C 


StoC 


s 


s 


S 




Cell dimensions (pm) 




0.8 x 


0.8 x 


0.8 x 


0.9 x 


0.8 x 


0.8 x 


0.9 x 


0.8 x 




1.2 


1.2 


1.2 


1.1 


1.2 to 


1.5 to 


2.0 to 


1.5 to 












1.5 


2.0 


2.5 


2.0 



S: short rod; C: coccoid

Table 12. Zeaxanthin production by Paracoccus sp. strains R-1512, R1534, R114 and R-1506.







                     Zeaxanthin (mg/l)                    Specific Formation (mg zeaxanthin/OD660)
Strain    Time (h)   Average       Standard Deviation     Average       Standard Deviation
R-1512     6         [illegible]   [illegible]            0.10          0.04
R-1512    10         [illegible]   [illegible]            0.25          0.08
R-1512    24         3.78          0.59                   0.38          0.06
R1534      6         [illegible]   [illegible]            [illegible]   [illegible]
R1534     10         3.45          0.57                   0.43          0.07
R1534     24         9.13          0.97                   0.95          0.06
R114       6         0.65          0.17                   0.86          0.24
R114      15         7.53          1.12                   1.13          0.21
R114      24         19.7          1.82                   2.68          0.20
R-1506     6         0.13          0.06                   0.07          0.01
R-1506    10         1.35          0.31                   0.19          0.04
R-1506    24         3.55          0.68                   0.38          0.07
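
The Specific Formation column of Table 12 is defined in the text as zeaxanthin production
normalized to OD660. A minimal sketch of that normalization, using only the 24-hour
averages from Table 12, is given here. The function names are hypothetical, and because the
tabulated values are averages over several experiments, a culture density back-calculated
from two averages is only an approximation of the densities actually measured.

```python
# 24-hour averages from Table 12:
# strain -> (zeaxanthin in mg/l, Specific Formation in mg zeaxanthin per OD660 unit)
TABLE_12_24H = {
    "R-1512": (3.78, 0.38),
    "R1534": (9.13, 0.95),
    "R114": (19.7, 2.68),
    "R-1506": (3.55, 0.38),
}


def specific_formation(zeaxanthin_mg_per_l: float, od660: float) -> float:
    """Zeaxanthin titer normalized to culture density."""
    if od660 <= 0:
        raise ValueError("OD660 must be positive")
    return zeaxanthin_mg_per_l / od660


def implied_od660(strain: str) -> float:
    """Culture density implied by the tabulated titer and Specific Formation."""
    titer, specific = TABLE_12_24H[strain]
    return titer / specific


def fold_over_parent(strain: str, parent: str = "R-1512") -> float:
    """Specific Formation of a strain relative to the parental strain."""
    return TABLE_12_24H[strain][1] / TABLE_12_24H[parent][1]


if __name__ == "__main__":
    for name in TABLE_12_24H:
        print(name, round(implied_od660(name), 2), round(fold_over_parent(name), 2))
```

Comparing strains on Specific Formation rather than on titer alone separates a gain in
zeaxanthin per unit of biomass from a gain that merely reflects denser cultures.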






Example 3: IPP Biosynthesis via the Mevalonate Pathway in the Zeaxanthin-Producing
Paracoccus sp. strain R114.

In order to determine the biosynthetic origin (i.e., the mevalonate or DXP pathway) of
isoprenoid precursors in Paracoccus sp. strain R114, a "retrobiosynthesis" approach
[Eisenreich and Bacher, In: Setlow (ed.), Genetic Engineering, Principles and Methods,
Kluwer Academic/Plenum Publishers, New York 22:121-153 (2000)] was taken. This
predictive approach for data analysis permits the unequivocal assessment of glucose
catabolism from the analysis of a single down-stream natural product. In the present
work, this involved growth of the bacterium in media containing various binary mixtures
of unlabeled glucose and specific 13C-labeled glucoses, followed by purification of the
zeaxanthin produced and analysis of the labeling patterns by NMR spectroscopy. Details
of the methods used and the experimental results are given below.

Growth of Paracoccus sp. strain R114 for 13C labeling experiments. Unlabeled D-glucose
monohydrate was purchased from Fluka (Milwaukee, WI, USA). [U-13C6] D-glucose was
purchased from Isotec (Miamisburg, OH, USA), while [1-13C1] D-glucose, [2-13C1]
D-glucose and [6-13C1] D-glucose were from Cambridge Isotope Laboratories (Andover, MA,
USA). Yeast extract and peptone (from casein, pancreatically digested) were purchased
from EM Science (Gibbstown, NJ, USA). All other salts and solvents were analytical grade
and were purchased from standard chemicals suppliers.

All cultures were initiated from frozen cell suspensions (cell density of 12 OD660 units,
25% glycerol, stored at -70°C). One ml of thawed cell suspension was used to inoculate
pre-cultures (500-ml baffled shake flasks) containing 100 ml of 362F/2 medium having the
following composition: 30 g/l D-glucose, 10 g/l yeast extract, 10 g/l peptone, 5 g/l NaCl,
2.5 g/l MgSO4·7H2O, 0.75 g/l (NH4)2HPO4, 0.625 g/l K2HPO4, 187.5 mg/l CaCl2·2H2O,
0.2 g/l (NH4)2Fe(SO4)2·6H2O, 15 mg/l ZnSO4·7H2O, 12.5 mg/l FeCl3·6H2O, 5 mg/l
MnSO4·H2O, 0.5 mg/l NiSO4·6H2O, 15 mg/l Na-EDTA and 9.375 µl/l HCl (37% stock
solution). The initial pH of the medium was 7.2.



The pre-culture was incubated at 28°C with shaking at 200 rpm for 24 h, after which time
the OD660 was about 22 absorbance units. The main cultures were grown in Bioflo 3000
bioreactors (New Brunswick Scientific, Edison, NJ, USA) containing 362F/2 medium
having the following composition: 30 g/l total D-glucose (see below for ratios of
13C-labeled:unlabeled glucose), 20 g/l yeast extract, 10 g/l peptone, 10 g/l NaCl, 5 g/l
MgSO4·7H2O, 1.5 g/l (NH4)2HPO4, 1.25 g/l K2HPO4, 0.4 g/l (NH4)2Fe(SO4)2·6H2O, 375
mg/l CaCl2·2H2O, 30 mg/l ZnSO4·7H2O, 25 mg/l FeCl3·6H2O, 10 mg/l MnSO4·H2O, 1 mg/l
NiSO4·6H2O, 30 mg/l Na-EDTA and 18.75 µl/l HCl (37% stock solution). The amounts of
each 13C-labeled glucose used (expressed as a percentage of the total 30 g/l glucose in the
medium) in four separate experiments were: Condition 1, 4% [U-13C6] D-glucose;
Condition 2, 50% [1-13C1] D-glucose; Condition 3, 25% [2-13C1] D-glucose + 1% [U-13C6]
D-glucose; Condition 4, 25% [6-13C1] D-glucose + 1% [U-13C6] D-glucose. A control with
only unlabeled glucose was also included. For Conditions 1 and 2 (and the unlabeled
control), the culture volume was 2 l, while the culture volume for Conditions 3 and 4 was
1 l. The bioreactors were inoculated with pre-culture (20 ml/l initial volume) and
cultivation proceeded for 22-24 hours, at which time no glucose was left in the medium.
Cultivation conditions were: 28°C, pH 7.2 (controlled with 25% H3PO4 and 28% NH4OH),
dissolved oxygen controlled (in a cascade with agitation) at a minimum of 40%, agitation
rate and aeration rate 300 rpm (minimum) and 1 vvm, respectively.
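
The glucose amounts implied by the four labeling conditions can be worked out as in the
following sketch. It assumes that the stated percentages are mass percentages of the total
30 g/l glucose and ignores the difference between anhydrous glucose and the monohydrate;
the data layout and function names are hypothetical.

```python
TOTAL_GLUCOSE_G_PER_L = 30.0
INOCULUM_ML_PER_L = 20.0

# Labeled glucose (% of total glucose) and culture volume for each condition in the text.
CONDITIONS = {
    1: {"volume_l": 2.0, "labeled_pct": {"[U-13C6]": 4.0}},
    2: {"volume_l": 2.0, "labeled_pct": {"[1-13C1]": 50.0}},
    3: {"volume_l": 1.0, "labeled_pct": {"[2-13C1]": 25.0, "[U-13C6]": 1.0}},
    4: {"volume_l": 1.0, "labeled_pct": {"[6-13C1]": 25.0, "[U-13C6]": 1.0}},
}


def glucose_masses(condition: int) -> dict:
    """Grams of each glucose species needed for one bioreactor run."""
    cond = CONDITIONS[condition]
    total_g = TOTAL_GLUCOSE_G_PER_L * cond["volume_l"]
    masses = {name: total_g * pct / 100.0 for name, pct in cond["labeled_pct"].items()}
    masses["unlabeled"] = total_g - sum(masses.values())
    return masses


def inoculum_ml(condition: int) -> float:
    """Pre-culture volume at 20 ml per liter of initial culture volume."""
    return INOCULUM_ML_PER_L * CONDITIONS[condition]["volume_l"]


if __name__ == "__main__":
    for number in CONDITIONS:
        print(number, glucose_masses(number), inoculum_ml(number))
```

Under these assumptions Condition 1 calls for 2.4 g of [U-13C6] glucose in 60 g of total
glucose, which shows how sparingly the uniformly labeled precursor is used compared with
the singly labeled glucoses.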



Purification of zeaxanthin. At the end of the cultivations, the cultures were cooled down
to 15°C. Five hundred ml of absolute ethanol was added per liter of culture and stirring
was continued at 100 rpm for 20 min. The treated culture was centrifuged for 20 min. at
5000 x g, and the supernatant was discarded. The wet pellet was then extracted with 5
volumes of THF for 20 min. with stirring. The extracted mixture was centrifuged, the
supernatant saved and the resulting pellet extracted a second time with 1 volume THF
under the same conditions and again centrifuged. The supernatants (extracts) were
combined and concentrated to 50 ml by rotary evaporation. Five milliliters of hexane were
added to the concentrated THF solution. After mixing, the system formed an emulsion
that could be separated by centrifugation. The aqueous phase was collected, diluted with
an equal volume of saturated NaCl solution and re-extracted with dichloromethane. The
dichloromethane phase was collected and combined with the THF/hexane phase. The
mixture of organic extracts was concentrated again in a rotary evaporator to remove
dichloromethane. The solution was then applied to a silica gel column and eluted with a
mixture of n-hexane and ether (1:1). A small light yellow band eluted first and was
discarded. The main zeaxanthin product eluted in a broad band that moved slowly in the
column. About 2 liters of solvent was needed to elute the main band completely. The
eluate was collected in a round-bottomed flask and the solvent was removed by rotary
evaporation at 40°C. The residue was dissolved in a small amount of dichloroethane at 40°C
and the solution was then allowed to cool slowly. Hexane was added to the mixture
dropwise until turbidity was observed. The crystallization was complete within 48 hours at
4°C. The crystals were collected on a paper filter, washed with cold methanol and dried
under vacuum.

NMR studies. Zeaxanthin was analyzed by NMR spectroscopy. For reference, the chemical
structure of zeaxanthin is illustrated in the following formula

[Structural formula of zeaxanthin]

1H-NMR and 13C-NMR spectra were recorded at 500.13 MHz and 125.6 MHz, respectively,
with a Bruker DRX 500 spectrometer. Acquisition and processing parameters for
one-dimensional experiments and two-dimensional INADEQUATE experiments were according
to standard Bruker software (XWINNMR). The solvent was deuterated chloroform.
The chemical shifts were referenced to solvent signals.

13C NMR spectra of the isotope-labeled zeaxanthin samples and of the zeaxanthin sample
at natural 13C abundance were recorded under the same experimental conditions.
Integrals were determined for every 13C NMR signal, and the signal integral for each
respective carbon atom in the labeled compound was referenced to that of the natural
abundance material, thus affording relative 13C abundances for each position in the
labeled molecular species. The relative abundances were then converted into absolute
abundances from 13C coupling satellites in the 1H NMR signal of H-18 at 1.71 ppm. In the
13C NMR spectrum of the multiply-labeled zeaxanthin sample each satellite was integrated
separately. The integral of each respective satellite pair was then referenced to the total
signal integral of a given carbon atom. Zeaxanthin comprises a total of eight isoprenoid
moieties (2 DMAPP units and 6 IPP units); only 20 13C NMR signals are observed due to
chemical shift degeneracy.
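
The arithmetic behind this quantification can be sketched as follows. It is a simplified
reading of the procedure described above rather than the authors' actual processing: the
function names and all integral values are hypothetical, and the calibration assumes that
the 13C satellites of a single 1H signal (here H-18) give the absolute 13C abundance of the
attached carbon, which then scales the relative abundances of all other positions.

```python
def relative_abundances(labeled: dict, natural: dict) -> dict:
    """Ratio of each 13C signal integral in the labeled sample to natural abundance."""
    return {pos: labeled[pos] / natural[pos] for pos in labeled}


def absolute_abundances(labeled: dict, natural: dict, ref_position: str,
                        satellite_integral: float, central_integral: float) -> dict:
    """Absolute 13C abundance (%) per position, calibrated on one reference carbon."""
    # Absolute 13C abundance of the reference carbon from the 1H satellite fraction.
    ref_percent = 100.0 * satellite_integral / (satellite_integral + central_integral)
    ratios = relative_abundances(labeled, natural)
    scale = ref_percent / ratios[ref_position]
    return {pos: ratio * scale for pos, ratio in ratios.items()}


def coupled_fraction(satellite_pair_integral: float, total_integral: float) -> float:
    """Share (%) of a carbon signal that appears as 13C13C-coupled satellites."""
    return 100.0 * satellite_pair_integral / total_integral


if __name__ == "__main__":
    # Hypothetical integrals for two positions; C-18 is the calibration carbon.
    labeled = {"C-18": 10.0, "C-4": 2.0}
    natural = {"C-18": 1.0, "C-4": 1.0}
    print(absolute_abundances(labeled, natural, "C-18", 12.0, 88.0))
    print(coupled_fraction(61.2, 100.0))
```

Referencing every integral to the natural abundance spectrum cancels position-specific
differences in signal response, so a single absolute calibration is enough to put all
positions on one scale.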

In the experiment with the mixture of [U-13C6] glucose and unlabeled glucose (1:7.5; w/w),
all carbon atoms of zeaxanthin were labeled and showed satellites due to 13C13C couplings
(Table 13). The signals of 4 carbon atoms have intense satellites due to 13C13C couplings
(61.2 ± 0.6 % in the global NMR signal intensity of a given atom, Table 13). The signal
accounting for the methyl atoms C-17/C-17' displayed only weak 13C-coupled satellites at
a relative intensity of 6%. The central signals represent material derived from unlabeled
glucose. The signals showed no evidence of long-range coupling. Carbon connectivity
was easily gleaned from 13C13C coupling constants (Table 13) and from two-dimensional
INADEQUATE experiments.

Three of the carbon atoms acquired label from [6-13C1] glucose. The other two carbons
were labeled from [2-13C1] glucose. No significant amounts of label were contributed to
zeaxanthin by [1-13C1] glucose.

The 13C abundance for all non-isochronous carbon atoms was determined by comparison
with spectra of unlabeled zeaxanthin and by evaluation of the 1H13C coupling satellites in
1H NMR spectra (Table 13). The fraction of jointly transferred carbon atom pairs in the
experiment with [U-13C6] glucose was determined by integration of the coupling satellites.

The labeling patterns of the IPP building block can be reconstructed accurately as shown
by the standard deviations found for the reconstructed IPP precursor. The reconstructed
labeling patterns of DMAPP and IPP were identical within the experimental limits.

Table 13. NMR results for 13C-labeled zeaxanthin produced by Paracoccus sp. strain R114
supplied with 13C-labeled glucoses.



                                           13C-labeled glucose precursor
                                           [1-13C]-   [2-13C]-   [6-13C]-   [U-13C6] glucose
Position   δ 13C (ppm)   Jcc (Hz)          % 13C      % 13C      % 13C      % 13C        % 13C13C
1, 1'      37.13         36.0 (16, 16')    1.10       10.71      2.22       ?.47         61.2
2, 2'      48.46         35.8 (3, 3')      1.20       2.?8       10.27      ?.65         61.1
3, 3'      65.10         35.8 (2, 2')      1.12       12.47      2.?8       ?.64         6?.4
4, 4'      42.57         37.1              1.27       2.59       10.6?      ?.8?         8.4
5, 5'      126.17        44.2 (18, 18')    1.14       12.45      ?.19       ?.68         61.1
6, 6'      137.77        56.4 (7, 7')      1.?0       2.15       9.?8       [illegible]  6?.4
7, 7'      125.59        56.2 (6, 6')      1.12       10.11      2.82       4.??         [illegible]
8, 8'      138.50        71.6, 55.7        1.28       2.24       [illegible] [illegible] [illegible]
9, 9'      135.69        43.1 (19, 19')    1.12       [illegible] 2.95      ?.84         61.7
10, 10'    131.31        59.7 (11, 11')    1.21       ?.18       9.?1       3.80         61.1
11, 11'    124.93        59.7 (10, 10')    1.10       8.79       2.70       4.02         61.0
12, 12'    137.57        70.5              1.20       2.01       8.80       3.59         5.1
13, 13'    136.48        43.1 (20, 20')    1.12       9.86       3.59       3.93         61.4
14, 14'    132.60        60.4 (15, 15')    1.21       2.83       10.51      3.77         59.5
15, 15'    130.08        60.4 (14, 14')    1.12       9.18       3.33       4.02         61.2
16, 16'    30.26         36.3 (1, 1')      1.27       3.19       12.31      3.91         62.0
17, 17'    28.73         34.9 (1, 1')      1.30       3.43       12.31      3.88         6.0
18, 18'    21.62         44.2 (5, 5')      1.27       3.01       11.66      3.70         62.0
19, 19'    12.82         43.1 (9, 9')      1.29       3.12       11.64      3.86         62.3
20, 20'    12.75         42.9 (13, 13')    1.33       3.21       11.99      3.75         62.1

?: digit illegible in the source



The experimental labeling patterns determined above can be compared with various
predictions, taking into account not only the mevalonate pathway vs. the DXP pathway for
isoprenoid biosynthesis, but also different pathways of glucose metabolism. Eubacteria
typically utilize glucose primarily via the glycolytic pathway or via the Entner-Doudoroff
pathway. Glycolysis generates two triose phosphate molecules from glucose. The C-1 and
C-6 of glucose are both diverted to the 3-position of the triose phosphates produced
during glycolysis. On the other hand, in the Entner-Doudoroff pathway, glucose is
converted to a mixture of glyceraldehyde 3-phosphate and pyruvate. The C-1 of glucose is
exclusively diverted to C-1 of pyruvate, and the C-6 of glucose is exclusively diverted to
C-3 of glyceraldehyde 3-phosphate.

Intermediates and products of the glycolytic and Entner-Doudoroff pathways serve as
starting material for both isoprenoid biosynthetic pathways. With regard to the
mevalonate pathway, pyruvate as well as triose phosphate can be converted to the precursor
acetyl-CoA. Glucose catabolism via the glycolytic pathway diverts label from C-1 as well
as C-6 of glucose to the methyl group of acetyl-CoA. Glucose catabolism via the
Entner-Doudoroff pathway results in loss of C-1 from glucose during the transformation of
pyruvate to acetyl-CoA.
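
The predictions compared in this analysis can be written down as a small carbon-fate
table. The sketch below is an illustration only: the fates of the glucose carbons follow
the pathway descriptions above, whereas the assignment of IPP carbons to the methyl or
carbonyl carbon of acetyl-CoA is the standard textbook pattern for the mevalonate pathway
and is an assumption not spelled out in this text. Label scrambling through other routes
(for example the pentose phosphate pathway) is ignored, and all names are hypothetical.

```python
# Fate of each glucose carbon once it reaches acetyl-CoA:
# "methyl", "carbonyl", or "lost" (released as CO2 when pyruvate is decarboxylated).
GLYCOLYSIS = {1: "methyl", 2: "carbonyl", 3: "lost",
              4: "lost", 5: "carbonyl", 6: "methyl"}
ENTNER_DOUDOROFF = {1: "lost", 2: "carbonyl", 3: "methyl",
                    4: "lost", 5: "carbonyl", 6: "methyl"}

# Assumed origin of the five IPP carbons in the mevalonate pathway.
IPP_ORIGIN = {1: "carbonyl", 2: "methyl", 3: "carbonyl", 4: "methyl", 5: "methyl"}


def labeled_ipp_positions(glucose_carbon: int, route: dict) -> list:
    """IPP carbons expected to carry label from a singly labeled glucose."""
    fate = route[glucose_carbon]
    if fate == "lost":
        return []
    return [pos for pos, origin in IPP_ORIGIN.items() if origin == fate]


if __name__ == "__main__":
    for carbon in (1, 2, 6):
        print(carbon,
              labeled_ipp_positions(carbon, GLYCOLYSIS),
              labeled_ipp_positions(carbon, ENTNER_DOUDOROFF))
```

The two routes differ only in the fate of C-1 and C-3 of glucose, which is why the
[1-13C1] glucose experiment is the one that discriminates between them.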

The experimentally observed enrichment and 13C13C coupling patterns of the zeaxanthin
produced by Paracoccus sp. strain R114 were in perfect agreement with the labeling pattern
required for zeaxanthin biosynthesis by the combination of the Entner-Doudoroff
pathway and the mevalonate pathway. If both the glycolytic and Entner-Doudoroff
pathways had been simultaneously operative under the experimental conditions used, at
least some label from [1-13C1] glucose should have been contributed to the zeaxanthin.
Furthermore, the mevalonate pathway can at best contribute blocks of two carbon atoms
to terpenoids, while in the DXP pathway three carbon units can be delivered to
isoprenoids via triose phosphate precursors. Although such three-carbon blocks become
separated by the rearrangement involved in the DXP pathway, blocks of three labeled
carbon atoms can still be recognized via long-range coupling. Corresponding 13C-13C
long-range couplings have been observed in the biosynthesis of the carotenoid lutein from
[2,3,4,5-13C4] 1-deoxy-D-xylulose by cultured plant cells (Catharanthus roseus) [Arigoni et
al., Proc. Nat. Acad. Sci. 94:10600-10605 (1997)]. No such long-range coupling was
observed in the present experiments with zeaxanthin produced by Paracoccus sp. strain
R114.

It should be noted that while the results presented here confirm isoprenoid production in
Paracoccus sp. strain R114 via the mevalonate pathway, and indicate that, under the growth
conditions used, there was little or no glucose metabolism via glycolysis, they do not rule
out the possibility of some metabolism of glucose via the pentose phosphate pathway in
addition to the Entner-Doudoroff pathway. Quantitative determination of glucose
metabolism via the latter two pathways could be obtained by analysis of labeling patterns of
pyruvate-derived amino acids (as was done for Paracoccus denitrificans [Dunstan et al.,
Biomedical and Environ. Mass Spectrometry 19:369-381 (1990)]).

Example 4: Cloning and Sequencing of the Genes Encoding IPP Isomerase and the
Enzymes of the Mevalonate Pathway from Paracoccus sp. strain R114.

Culture conditions. Paracoccus sp. strain R114 was grown at 28°C in F-medium (10 g/l
tryptone, 10 g/l yeast extract, 30 g/l NaCl, 10 g/l D-glucose, 5 g/l MgSO4·7H2O, pH 7.0) or
in the pre-culture medium described in Example 3 above. Liquid cultures were grown in a
rotary shaker at 200 rpm.

Isolation of genomic DNA. A 600-ml culture of Paracoccus sp. strain R114 was centrifuged
for 10 minutes at 10,000 x g at 4°C and the pellet was washed once with 200 ml lysis buffer
(0.1 M NaCl, 50 mM EDTA, 10 mM Tris-HCl, pH 7.5) and once with 100 ml lysis buffer.
The final pellet was resuspended in 20 ml lysis buffer containing 50 mg lysozyme and 1 mg
RNase A (DNase free). After incubation for 15 minutes at 37°C, 1.5 ml of 20% sodium
N-lauroyl-sarcosinate and 2.25 mg of proteinase K were added. After incubation at 50°C for
30-60 minutes, the lysate was extracted with one volume of buffer-saturated phenol, pH
7.5-7.8 (LifeTechnologies, Rockville, MD, USA) by gentle but thorough mixing. The
emulsion was centrifuged for 20 minutes at 30,000 x g and the aqueous phase was
re-extracted with phenol. The phases were separated as before and the aqueous phase was
extracted twice with one volume phenol:chloroform (1:1). At this step centrifugation for
20 minutes at 3,200 x g in a swinging bucket rotor was sufficient to obtain satisfactory
phase separation. After a final extraction with one volume of chloroform, 0.1 volume 3 M
sodium acetate (pH 5.2) was added and the solution was overlaid with 2 volumes ice-cold
ethanol. The precipitated DNA was spooled with a glass rod, soaked in 70% ethanol for 5
minutes, rinsed with chloroform and then air dried for 5-10 minutes. The DNA was
resuspended overnight in 5 ml TE (10 mM Tris-HCl, pH 7.5, 1 mM EDTA). Since the
solution was yellow due to traces of zeaxanthin, the organic extractions and the spooling
were repeated as above to obtain a clear preparation.

Isolation of λ-DNA: The Qiagen® Lambda Kit (Qiagen, Hilden, Germany) was used
following the manufacturer's instructions.

Polymerase chain reaction (PCR): Oligonucleotides were purchased from LifeTechnologies
(Rockville, MD, USA). PCR was performed in a GeneAmp® PCR system 9700 (PE
Applied Biosystems, Foster City, CA, USA) using the GC-rich PCR system (Roche
Molecular Biochemicals, Mannheim, Germany) according to the manufacturer's instructions.
Typically, the MgCl2 concentration used was 1.5 mM and the resolution solution was
added to 1 M final concentration.

DNA labeling and detection: The PCR DIG Probe Synthesis Kit and the DIG
Luminescent Detection Kit were used for DNA labeling and detection, respectively (both
obtained from Roche Molecular Biochemicals, Mannheim, Germany).

DNA sequencing: Sequencing reactions were performed using the BigDye® DNA sequencing
kit (PE Applied Biosystems, Foster City, CA, USA) according to the manufacturer's
instructions. Sequencing reactions were purified on DyeEx™ spin columns (Qiagen,
Hilden, Germany) and fragment separation and detection was done with an ABI Prism™
310 Genetic Analyzer (PE Applied Biosystems, Foster City, CA, USA).

λ-library: A custom made library with partially Sau3AI digested Paracoccus sp. strain R114
DNA in Lambda FIX® II was purchased from Stratagene (La Jolla, CA, USA).

Cloning, sequencing and characterization of the mevalonate pathway gene cluster from
Paracoccus sp. strain R114. One of the enzymes of the mevalonate pathway, mevalonate
diphosphate decarboxylase, contains highly conserved regions spanning several amino
acids. Three such regions were chosen from an alignment of all available eubacterial
mevalonate diphosphate decarboxylases and oligonucleotides were designed using the
preferred codon usage found in the carotenoid gene cluster of Paracoccus sp. strain R1534
(Table 14).

The oligonucleotides designed from two homology regions are shown in Table 15. To
reduce the degree of degeneracy, sets of oligonucleotides were designed from each peptide.
For instance, oligonucleotides mvd-103a-d differ only in the third nucleotide from the 3'
end, each accounting for one possible codon for glycine (GGA, although rarely used, was
included because of the close proximity to the 3' end). Alternate amino acids were
accounted for by designing oligonucleotides to both residues, e.g. oligonucleotides
mvd-101a and mvd-101b are specific for leucine or isoleucine, respectively, in the second
position of peptide 1 (Table 15). PCR with oligonucleotides mvd-101 and mvd-104 or
mvd-106, using Paracoccus sp. strain R114 DNA as template, gave a product of the expected
size. The PCR product was cloned in the vector pCR®2.1-TOPO (Invitrogen, Carlsbad,
CA, USA) and sequenced. The cloned fragment was used as a probe for a Southern
analysis of Paracoccus sp. strain R114 DNA and was found to hybridize to a BamHI-SalI
fragment of about 950 bp. Paracoccus sp. strain R114 DNA was cut with BamHI and SalI
and the fragments were separated by agarose gel electrophoresis. The region around 950
bp was isolated and cloned in the vector pUC19. This partial library was then screened
using the mvd-PCR fragment as a probe and the insert of a positive clone was sequenced.
In parallel, a λ-library prepared from Paracoccus sp. strain R114 DNA was screened using
the mvd-PCR fragment as a probe. DNA was isolated from two positive λ-clones and cut
with BamHI and SalI or EcoRI and SalI. A number of the restriction fragments were
isolated and cloned in the vector pUC19. Several of the fragments contained sequences
homologous to genes encoding proteins of the mevalonate pathway. The clones connecting
these individual sequences were obtained by PCR with primers derived from the sequences
of the cloned restriction fragments using the DNA of the λ-clones as template. The
assembled sequence from all fragments (SEQ ID NOs:42, 44, 46, 48, 50, and 52) and the
sequences of the encoded proteins are shown in the Sequence Listing (SEQ ID NOs:43, 45,
47, 49, 51, and 53). Due to a limitation of the PatentIn Program, operons with overlapping
genes cannot be shown as a single sequence. Thus, for each gene in the mevalonate
operon, the entire nucleotide sequence of the operon is repeated for each gene.
Accordingly, SEQ ID NOs:42, 44, 46, 48, 50, and 52 are identical. For purposes of the
present invention, we use SEQ ID NO:42 to refer to the nucleotide sequence of the
mevalonate operon.

The arrangement of the mevalonate pathway genes in Paracoccus sp. strain R114 is unique
when compared to known mevalonate gene clusters of other bacteria. Besides Paracoccus
sp. strain R114, only Borrelia burgdorferi and Streptomyces sp. strain CL190 (Takagi
et al., supra) have all mevalonate genes in a single operon (Wilding et al., supra). In
Streptococcus pyogenes all mevalonate genes are clustered in a single locus but they are
grouped in two operons. All other species have two loci with the two kinases and the
mevalonate diphosphate decarboxylase grouped in one operon and the HMG-CoA synthase
and the HMG-CoA reductase on a second locus, either forming an operon (in
Streptococcus pneumoniae) or as separate transcription units. All species except the
members of Staphylococcus have an additional gene linked with the mevalonate cluster,
which was recently identified as an IPP isomerase (idi gene in Streptomyces sp. strain
CL190) (Kaneda et al., supra). The two Enterococcus species and Staphylococcus
haemolyticus have an acetyl-CoA acetyltransferase gene linked with the HMG-CoA reductase
gene. In the Enterococcus species the latter two genes are fused.
Table 14: Codon usage in Paracoccus sp. strain R1534 carotenoid (crt) genes

Amino acid   Codon   Number used   % Used
A - Ala      GCT             3        1.4
             GCC            96       46.2
             GCA            15        7.2
             GCG            94       45.2
C - Cys      TGT             0        0.0
             TGC            15      100.0
D - Asp      GAT            46       38.0
             GAC            75       62.0
E - Glu      GAA            17       25.4
             GAG            50       74.6
F - Phe      TTT             3        5.6
             TTC            51       94.4
G - Gly      GGT            16       10.8
             GGC            87       58.8
             GGA             5        3.4
             GGG            40       27.0
H - His      CAT            30       56.6
             CAC            23       43.4
I - Ile      ATT             5        6.4
             ATC            72       92.3
             ATA             1        1.3
K - Lys      AAA             4       14.3
             AAG            24       85.7
L - Leu      TTA             0        0.0
             TTG             5        2.9
             CTT            15        8.7
             CTC            11        6.4
             CTA             1        0.6
             CTG           140       81.4
M - Met      ATG            49      100.0
N - Asn      AAT             4       20.0
             AAC            16       80.0
P - Pro      CCT             2        2.3
             CCC            41       47.7
             CCA             3        3.5
             CCG            40       46.5
Q - Gln      CAA             6       11.3
             CAG            47       88.7
R - Arg      CGT            11        7.3
             CGC           103       68.2
             CGA             2        1.3
             CGG            26       17.2
             AGA             2        1.3
             AGG             7        4.6
S - Ser      TCT             1        1.1
             TCC            17       19.5
             TCA             0        0.0
             TCG            39       44.8
             AGT             2        2.3
             AGC            28       32.2
T - Thr      ACT             2        2.7
             ACC            36       48.0
             ACA             4        5.3
             ACG            33       44.0
V - Val      GTT             6        5.7
             GTC            61       57.5
             GTA             1        0.9
             GTG            38       35.8
W - Trp      TGG            27      100.0
Y - Tyr      TAT            28       62.2
             TAC            17       37.8
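
By way of illustration only, the percentages in Table 14 follow directly from the codon counts, and the most frequent codon for each amino acid is the one that would be favoured when designing oligonucleotides. The sketch below copies only the Ala and Gly rows of Table 14; the names CODON_COUNTS, percent_used and preferred_codon are hypothetical helpers introduced here and are not part of the original work.

```python
# Codon counts copied from the Ala and Gly rows of Table 14
# (Paracoccus sp. strain R1534 crt genes). Other rows are omitted for brevity.
CODON_COUNTS = {
    "Ala": {"GCT": 3, "GCC": 96, "GCA": 15, "GCG": 94},
    "Gly": {"GGT": 16, "GGC": 87, "GGA": 5, "GGG": 40},
}


def percent_used(counts):
    """Return {codon: share of all codons for that amino acid, in percent}."""
    total = sum(counts.values())
    return {codon: round(100.0 * n / total, 1) for codon, n in counts.items()}


def preferred_codon(counts):
    """Return the most frequently used codon for one amino acid."""
    return max(counts, key=counts.get)


for amino_acid, counts in CODON_COUNTS.items():
    print(amino_acid, percent_used(counts), "preferred:", preferred_codon(counts))
```

For glycine this reproduces the low share of GGA noted in the text, which is why GGA is described as rarely used.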
Table 15: Oligonucleotides designed from two conserved bacterial Mvd peptides.

                                       Sequence                     SEQ ID NO
Peptide 1                              AlaLeuIleLysTyrTrpGlyLys     23
  alternate²                              Ile
Nucleotide sequence¹ (5'-3')           GCSCTGATCAARTAYTGGGGBAAR     24
  alternate²                              ATC
Oligonucleotide mvd-101a (5'-3')       GCSCTGATCAARTAYTGGGG         25
Oligonucleotide mvd-101b (5'-3')       GCSATCATCAARTAYTGGGG         26
Oligonucleotide mvd-103a (5'-3')       ATCAARTAYTGGGGTAA            27
Oligonucleotide mvd-103b (5'-3')       ATCAARTAYTGGGGCAA            28
Oligonucleotide mvd-103c (5'-3')       ATCAARTAYTGGGGGAA            29
Oligonucleotide mvd-103d (5'-3')       ATCAARTAYTGGGGAAA            30

Peptide 2                              ThrMetAspAlaGlyProAsnVal     31
  alternate²                                             Gln
Nucleotide sequence¹ (5'-3')           ACSATGGAYGCSGGBCCSAAYGTS     32
  alternate²                                             CAR
Complement (3'-5')                     TGSTACCTRCGSCCVGGSTTRCAS     33
  alternate²                                             GTY
Oligonucleotide mvd-104a (3'-5')       TGGTACCTACGSCCVGG            34
Oligonucleotide mvd-104b (3'-5')       TGGTACCTGCGSCCVGG            35
Oligonucleotide mvd-104c (3'-5')       TGCTACCTACGSCCVGG            36
Oligonucleotide mvd-104d (3'-5')       TGCTACCTGCGSCCVGG            37
Oligonucleotide mvd-106a (3'-5')       TACCTACGSCCVGGSTTRCA         38
Oligonucleotide mvd-106b (3'-5')       TACCTGCGSCCVGGSTTRCA         39
Oligonucleotide mvd-106c (3'-5')       TACCTACGSCCVGGSGTYCA         40
Oligonucleotide mvd-106d (3'-5')       TACCTGCGSCCVGGSGTYCA         41

1 using the preferred codons of Paracoccus sp. strain R1534, see table 1
2 alternate amino acid present in some enzymes

S = C or G; R = A or G; Y = C or T; B = C or G or T; V = A or C or G
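
By way of illustration only, the degeneracy of the oligonucleotides in Table 15 and the relationship between the coding sequence of peptide 2 and its complement can be checked with a few lines of code. The sketch below uses only the ambiguity codes defined in the table legend; the function names degeneracy, expand and complement are hypothetical helpers introduced here.

```python
from itertools import product

# Ambiguity codes from the legend of Table 15, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG", "R": "AG", "Y": "CT", "B": "CGT", "V": "ACG",
}
# Complement of each code, i.e. the code for the complemented base set.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "S": "S", "R": "Y", "Y": "R", "B": "V", "V": "B",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    count = 1
    for base in oligo:
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """List every non-degenerate sequence encoded by the oligonucleotide."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo))]


def complement(oligo):
    """Base-by-base complement (not reversed), keeping degenerate codes."""
    return "".join(COMPLEMENT[b] for b in oligo)


mvd_101a = "GCSCTGATCAARTAYTGGGG"
peptide2_coding = "ACSATGGAYGCSGGBCCSAAYGTS"

print(degeneracy(mvd_101a))         # product of the S, R and Y positions
print(degeneracy(peptide2_coding))  # full coding sequence, before splitting into sets
print(complement(peptide2_coding))  # written 3'-5', as in the Complement row
```

Comparing the degeneracy of the full peptide 2 coding sequence with that of an individual oligonucleotide such as mvd-104a shows how splitting the design into sets a to d reduces the number of species in each primer.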

The genes of the mevalonate operon from Paracoccus sp. strain R114 were identified by
homology of the gene products to proteins in general databases. An amino acid alignment
of the HMG-CoA reductase from Paracoccus sp. strain R114 (SEQ ID NO:43) was performed
with bacterial class I HMG-CoA reductases of Streptomyces sp. strain CL190 (SEQ
ID NO:54), S. griseolosporeus (SEQ ID NO:55), and Streptomyces sp. strain KO-3899 (SEQ
ID NO:56). EMBL/GenBank/DDBJ database accession numbers are q9z9n4 for Streptomyces
sp. strain CL190, q9znh1 for S. griseolosporeus and q9znh0 for Streptomyces sp. strain
KO-3899. There are two classes of HMG-CoA reductases [Bochar et al., Mol. Genet.
Metab. 66:122-127 (1999); Boucher et al., Mol. Microbiol. 37:703-716 (2000)]. Eubacterial
HMG-CoA reductases are generally of class II, whereas class I enzymes are found in
eukaryotes and archaea. The Streptomyces and the Paracoccus HMG-CoA reductases
together with the enzyme from Vibrio cholerae are the only eubacterial HMG-CoA
reductases of class I known so far.

An amino acid alignment of isopentenyl diphosphate isomerase (IPP isomerase) (idi) from
Paracoccus sp. strain R114 (SEQ ID NO:45) was performed with close homologs found in
the EMBL database, i.e. Erwinia herbicola (Q01335) (SEQ ID NO:57), Borrelia burgdorferi
(O51627) (SEQ ID NO:58), Synechocystis sp. PCC 6803 (P74287) (SEQ ID NO:59),
Streptomyces sp. CL190 (Q9KWG2) (SEQ ID NO:60), Streptomyces griseolosporeus
(Q9KWF6) (SEQ ID NO:61), Sulfolobus solfataricus (P95997) (SEQ ID NO:62), Rickettsia
prowazekii (Q9ZD90) (SEQ ID NO:63), Deinococcus radiodurans (Q9RVE2) (SEQ ID
NO:64), Aeropyrum pernix (Q9YB30) (SEQ ID NO:65), Halobacterium sp. NRC-1
(O54623) (SEQ ID NO:66), Archaeoglobus fulgidus (O27997) (SEQ ID NO:67), Pyrococcus
abyssi (Q9UZS9) (SEQ ID NO:68), Pyrococcus horikoshii (O58893) (SEQ ID NO:69),
Methanobacterium thermoautotrophicum (O26154) (SEQ ID NO:70), Methanococcus
jannaschii (Q58272) (SEQ ID NO:71), Thermoplasma acidophilum (CAC11250) (SEQ ID
NO:72) and Leishmania major (Q9NDJ5) (SEQ ID NO:73). EMBL/GenBank/DDBJ database
accession numbers are given after the organism's name in parentheses. The first nine
sequences are from eubacteria and the next eight sequences are from archaea.
Interestingly, one eukaryotic species, the protozoan parasite Leishmania major (SEQ ID
NO:73), also has a protein that is highly homologous. This is unexpected because other
eukaryotes have a different idi, designated type 1 (Kaneda et al., supra). A conserved
hypothetical protein from Bacillus subtilis, YpgA, also has substantial homology but is
considerably smaller than the type 2 idi's.

An amino acid alignment of bacterial HMG-CoA synthase from Paracoccus sp. strain R114
(SEQ ID NO:47) was performed with close homologs found in the EMBL database, i.e.
Streptococcus pneumoniae (AAG02453) (SEQ ID NO:74), Streptococcus pyogenes
(AAG02448) (SEQ ID NO:75), Enterococcus faecalis (AAG02438) (SEQ ID NO:76),
Enterococcus faecium (AAG02443) (SEQ ID NO:77), Staphylococcus haemolyticus
(AAG02427) (SEQ ID NO:78), Staphylococcus epidermidis (AAG02433) (SEQ ID NO:79),
Staphylococcus aureus (AAG02422) (SEQ ID NO:80), Staphylococcus carnosus (Q9ZB67)
(SEQ ID NO:81), Streptomyces sp. CL190 (Q9KWG1) (SEQ ID NO:82), Streptomyces
griseolosporeus (Q9KWF5) (SEQ ID NO:83) and Borrelia burgdorferi (O51626) (SEQ ID
NO:84). EMBL/GenBank/DDBJ database accession numbers are given after each
organism's name in parentheses. The first 43 amino acids of the sequence from
Streptomyces griseolosporeus are missing in the database version.

An amino acid alignment of bacterial mevalonate diphosphate decarboxylase from
Paracoccus sp. strain R114 (SEQ ID NO:53) was performed with the orthologous proteins from
other bacteria, i.e. Streptococcus pneumoniae (AAG02456) (SEQ ID NO:85), Streptococcus
pyogenes (AAG02451) (SEQ ID NO:86), Enterococcus faecalis (AAG02441) (SEQ ID
NO:87), Enterococcus faecium (AAG02446) (SEQ ID NO:88), Staphylococcus haemolyticus
(AAG02431) (SEQ ID NO:89), Staphylococcus epidermidis (AAG02436) (SEQ ID NO:90),
Staphylococcus aureus (AAG02425) (SEQ ID NO:91), Streptomyces sp. CL190 (Q9KWG4)
(SEQ ID NO:92), Streptomyces griseolosporeus (Q9KWF8) (SEQ ID NO:93) and Borrelia
burgdorferi (O51629) (SEQ ID NO:94). EMBL/GenBank/DDBJ database accession numbers
are given after each organism's name in parentheses.

Two proteins from Myxococcus xanthus, Tac and Taf (database accession numbers q9xb06
and q9xb03, respectively), and a protein from B. subtilis, PksG, a putative polyketide
biosynthesis protein (database accession number p40830), have substantial homology to the
Paracoccus sp. strain R114 HMG-CoA synthase. The homology between the Paracoccus sp.
strain R114 HMG-CoA synthase and the Tac and Taf proteins of M. xanthus is greater
than the homology between the HMG-CoA synthases from Paracoccus sp. strain R114 and
eukaryotes. The bacterial HMG-CoA synthases and the bacterial mevalonate diphosphate
decarboxylases share substantial homology with their eukaryotic orthologs. Archaeal
HMG-CoA synthases form a more distantly related group of enzymes (Wilding et al.,
supra) and no mevalonate diphosphate decarboxylase orthologs are found in archaea
[Smit and Mushegian, Genome Res. 10:1468-1484 (2000)].

Alignments of the mevalonate kinase (Mvk) (SEQ ID NO:49) and the phosphomevalonate
kinase (Pmk) (SEQ ID NO:51) from Paracoccus sp. strain R114 were performed with the
orthologous proteins from other bacteria, i.e. Streptococcus pneumoniae (AAG02455) (SEQ
ID NO:95), Streptococcus pyogenes (AAG02450) (SEQ ID NO:96), Enterococcus faecalis
(AAG02440) (SEQ ID NO:97), Enterococcus faecium (AAG02445) (SEQ ID NO:98),
Staphylococcus haemolyticus (AAG02430) (SEQ ID NO:99), Staphylococcus epidermidis
(AAG02435) (SEQ ID NO:100), Staphylococcus aureus (AAG02424) (SEQ ID NO:101),
Streptomyces sp. CL190 (Q9KWG5) (SEQ ID NO:102), Streptomyces griseolosporeus
(Q9KWF9) (SEQ ID NO:103) and Borrelia burgdorferi (O51631) (SEQ ID NO:104) (Mvk);
and Streptococcus pneumoniae (AAG02457) (SEQ ID NO:105), Streptococcus pyogenes
(AAG02452) (SEQ ID NO:106), Enterococcus faecalis (AAG02442) (SEQ ID NO:107),
Enterococcus faecium (AAG02447) (SEQ ID NO:108), Staphylococcus haemolyticus
(AAG02432) (SEQ ID NO:109), Staphylococcus epidermidis (AAG02437) (SEQ ID NO:110),
Staphylococcus aureus (AAG02426) (SEQ ID NO:111), Streptomyces sp. CL190 (Q9KWG3)
(SEQ ID NO:112), Streptomyces griseolosporeus (Q9KWF7) (SEQ ID NO:113) and Borrelia
burgdorferi (O51630) (SEQ ID NO:114) (Pmk). EMBL/GenBank/DDBJ database accession
numbers are given after each organism's name in parentheses.

There is much less homology among the bacterial kinases than among the bacterial
orthologs of the other enzymes of the mevalonate pathway. The mevalonate kinase from
Paracoccus sp. strain R114 (SEQ ID NO:49) has a 37 amino acid insert in the amino-terminal
region, which is lacking in other mevalonate kinases. Together with the bacterial Mvk's,
some archaeal enzymes, e.g. from Archaeoglobus fulgidus, Methanobacterium
thermoautotrophicum and Pyrococcus abyssi, are among the best homologues to the Mvk from
Paracoccus sp. strain R114. The homology among bacterial phosphomevalonate kinases is even
weaker than the homology among the bacterial mevalonate kinases. The proteins with the
best homologies to the Pmk from Paracoccus sp. strain R114 (SEQ ID NO:51) are Mvk's
from archaea, e.g. Aeropyrum pernix, Pyrococcus horikoshii, M. thermoautotrophicum, P.
abyssi and A. fulgidus. Since no Pmk's are found in archaea (Smit and Mushegian, supra),
this suggests that the same kinase might perform both phosphorylations.

Example 5: Over-expression of the Mevalonate Pathway Genes and the idi Gene from
Paracoccus sp. strain R114 in E. coli

Cloning and expression of the mevalonate operon in E. coli. A λ clone, designated clone
16, from the Paracoccus sp. strain R114 λ library (see Example 4) was used as a template for
PCR amplification of the entire mevalonate operon. Primers Mevop-2020 and Mevop-9027
(Table 16) were used for PCR.

Table 16. Primers used for amplification of mevalonate operon from Paracoccus sp.
strain R114.

Primer        Sequence (5'→3')
Mevop-2020    GGGCAAGCTTGTCCACGGCACGACCAAGCA (SEQ ID NO:115)
Mevop-9027    CGTAATCCGCGGCCGCGTTTCCAGCGCGTC (SEQ ID NO:116)
The resulting PCR product was cloned in TOPO-XL (Invitrogen, Carlsbad, CA, USA),
resulting in plasmid TOPO-XL-mev-op16. The insert carrying the mevalonate operon was
excised with HindIII and SacI and cloned in the HindIII-SacI cut vector pBBR1MCS-2
[Kovach et al., Gene 166:175-176 (1995)], resulting in plasmid pBBR-K-mev-op16.
Plasmid pBBR-K-mev-op16 was used to transform electroporation-competent E. coli
strain TG1 [Stratagene, La Jolla, CA; Sambrook et al., In: Nolan, C (ed.), Molecular
Cloning: A Laboratory Manual (Second Edition), p. A.12 (1989)]. Two representative
positive transformants (E. coli TG1/pBBR-K-mev-op16-1 and E. coli TG1/pBBR-K-mev-op16-2)
were grown in Luria Broth (LB, GibcoBRL, Life Technologies) containing 50 mg/l
kanamycin and tested for HMG-CoA reductase activity (encoded by the Paracoccus sp.
strain R114 mvaA gene) using the methods described in Example 1. E. coli does not
possess a gene coding for the enzyme HMG-CoA reductase, hence the lack of detectable
activity in the untransformed host. The crude extracts of both representative
transformants of E. coli TG1/pBBR-K-mev-op16 had easily measurable HMG-CoA reductase
activity, demonstrating the heterologous expression of the cloned mvaA gene.



Table 17. HMG-CoA reductase activity in crude extracts of E. coli TG1 cells carrying the
cloned mevalonate gene cluster from Paracoccus sp. strain R114.

Strain                            HMG-CoA reductase activity (U/mg)
E. coli TG1                       Not detectedᵃ
E. coli TG1/pBBR-K-mev-op16-1     0.25
E. coli TG1/pBBR-K-mev-op16-2     0.78

a Less than 0.03 U/mg



Cloning and expression of the idi gene and the individual mevalonate pathway genes from
Paracoccus sp. strain R114 in E. coli. The coding regions of the mevalonate operon genes
from Paracoccus sp. strain R114 were amplified by PCR using the primers shown in Table
18. The primers were designed such that the ATG start codons constituted the second half
of an NdeI site (cleavage recognition site CATATG), and BamHI sites (GGATCC) were
introduced immediately after the stop codons. All PCR products were cloned in the
pCR®2.1-TOPO vector. The names of the resulting vectors are listed in Table 19. Except
for the mevalonate kinase gene, all genes contained restriction sites for BamHI, NdeI or
EcoRI, which had to be eliminated in order to facilitate later cloning steps. The sites were
eliminated by introducing silent mutations using the QuikChange™ site-directed
mutagenesis kit (Stratagene, La Jolla, CA, USA) and the oligonucleotides shown in Table 20.
The mutagenized coding regions were excised from the TOPO-plasmids with BamHI and
NdeI and ligated with the BamHI-NdeI cleaved expression vectors pDS-His and pDS.
These expression vectors were derived from pDSNdeHis, which is described in Example 2
of EP 821,063. The plasmid pDS-His was constructed from pDSNdeHis by deleting an 857
bp NheI-XbaI fragment carrying a silent chloramphenicol acetyltransferase gene. The
plasmid pDS was constructed from pDS-His by replacing a small EcoRI-BamHI fragment
with the annealed primers S/D-1 (5' AATTAAAGGAGGGTTTCATATGAATTCG) (SEQ
ID NO:117) and S/D-2 (5' GATCCGAATTCATATGAAACCCTCCTTT) (SEQ ID
NO:118).
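
By way of illustration only, the way the two S/D primers anneal can be worked out from their sequences. The sketch below assumes that each primer carries a 4-nucleotide 5' overhang and that the remaining bases pair; the helper names revcomp and annealed_ends are hypothetical and introduced here.

```python
COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMP)[::-1]


def annealed_ends(top, bottom, overhang=4):
    """For two oligonucleotides that pair except for 5' overhangs of equal
    length, return (top 5' overhang, double-stranded core, bottom 5' overhang).
    Raises ValueError if the cores are not complementary."""
    core_top = top[overhang:]
    core_bottom = revcomp(bottom)[:-overhang]
    if core_top != core_bottom:
        raise ValueError("oligonucleotides do not anneal as assumed")
    return top[:overhang], core_top, bottom[:overhang]


sd_1 = "AATTAAAGGAGGGTTTCATATGAATTCG"  # S/D-1, SEQ ID NO:117
sd_2 = "GATCCGAATTCATATGAAACCCTCCTTT"  # S/D-2, SEQ ID NO:118

left, core, right = annealed_ends(sd_1, sd_2)
print(left, core, right)
```

Under these assumptions the duplex has an AATT overhang at one end and a GATC overhang at the other, which is what ligation into an EcoRI-BamHI cut vector requires, and its core carries an AGGAGG motif followed by the NdeI site CATATG.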

Table 18: Oligonucleotides for the cloning of the mevalonate operon genes.

Gene    Primer       Sequence (5'-3')                        SEQ ID NO

Forward primers
mvaA    MvaA-Nde     AAGGCCTCATATGATTTCCCATACCCCGGT          119
idi     Idi-Nde      AAGGCCTCATATGACCGACAGCAAGGATCA          121
hcs     Hcs-Nde      AAGGCCTCATATGAAAGTGCCTAAGATGA           123
mvk     Mvk-Nde¹     AAGGCCTCATATGAGCACCGGCAGGCCTGAAGCA      125
pmk     Pmk-Nde      AAGGCCTCATATGGATCAGGTCATCCGCGC          127
mvd     Mvd-Nde      AAGGCCTCATATGACTGATGCCGTCCGCGA          129

Reverse primers
mvaA    MvaA-Bam     CGGGATCCTCATCGCTCCATCTCCATGT            120
idi     Idi-Bam      CGGGATCCTCATTGACGGATAAGCGAGG            122
hcs     Hcs-Bam      CGGGATCCTCAGGCCTGCCGGTCGACAT            124
mvk     Mvk-Bam²     CGGGATCCTCATCCCTGCCCCGGCAGCGGTT         126
pmk     Pmk-Bam      CGGGATCCTCAGTCATCGAAAACAAGTC            128
mvd     Mvd-Bam      CGGGATCCTCAACGCCCCTCGAACGGCG            130

1 The second codon TCA was changed to AGC (silent mutation; both codons encode
serine).
2 The last codon GGC was changed to GGA (silent mutation; both codons encode glycine).
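
By way of illustration only, the primer design rule stated above (ATG start codon forming the second half of an NdeI site, and a BamHI site directly after the stop codon) can be checked against the sequences in Table 18. The names PRIMERS and check_pair are hypothetical and introduced here.

```python
NDEI = "CATATG"   # NdeI site; its last three bases are the ATG start codon
BAMHI = "GGATCC"  # BamHI site

# (forward primer, reverse primer), copied from Table 18.
PRIMERS = {
    "mvaA": ("AAGGCCTCATATGATTTCCCATACCCCGGT", "CGGGATCCTCATCGCTCCATCTCCATGT"),
    "idi": ("AAGGCCTCATATGACCGACAGCAAGGATCA", "CGGGATCCTCATTGACGGATAAGCGAGG"),
    "hcs": ("AAGGCCTCATATGAAAGTGCCTAAGATGA", "CGGGATCCTCAGGCCTGCCGGTCGACAT"),
    "mvk": ("AAGGCCTCATATGAGCACCGGCAGGCCTGAAGCA", "CGGGATCCTCATCCCTGCCCCGGCAGCGGTT"),
    "pmk": ("AAGGCCTCATATGGATCAGGTCATCCGCGC", "CGGGATCCTCAGTCATCGAAAACAAGTC"),
    "mvd": ("AAGGCCTCATATGACTGATGCCGTCCGCGA", "CGGGATCCTCAACGCCCCTCGAACGGCG"),
}

COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMP)[::-1]


def check_pair(forward, reverse):
    """Return (orf_start, stop_codon): the coding-strand bases of the forward
    primer from the ATG onwards, and the stop codon that lies directly next to
    the BamHI site of the reverse primer, read on the coding strand."""
    i = forward.find(NDEI)
    j = reverse.find(BAMHI)
    if i < 0 or j < 0:
        raise ValueError("expected restriction site not found")
    orf_start = forward[i + 3:]                  # begins with ATG
    next_to_site = reverse[j + len(BAMHI):][:3]  # primer bases 3' of GGATCC
    return orf_start, revcomp(next_to_site)


for gene, (fwd, rev) in PRIMERS.items():
    orf_start, stop = check_pair(fwd, rev)
    print(gene, orf_start[:6], stop)
```

For the mvk pair, the second codon reported by this check is AGC, in agreement with footnote 1 of Table 18.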
Table 19: Names of expression plasmids and construction intermediates.

Gene   PCR fragments in   After first    After 2nd      Genes in    Genes in
       pCR®2.1-TOPO       mutagenesis    mutagenesis    pDS         pDS-His
mvaA   TOPO-mvaA-BB       TOPO-mvaA-B    TOPO-mvaA      pDS-mvaA    pDS-His-mvaA
idi    TOPO-ORFX-B        TOPO-idi       n/a            pDS-idi     pDS-His-idi
hcs    TOPO-hcs-EN        TOPO-hcs-N     TOPO-hcs       pDS-hcs     pDS-His-hcs
mvk    TOPO-mvk           n/a            n/a            pDS-mvk     pDS-His-mvk
pmk    TOPO-pmk-B         TOPO-pmk       n/a            nd          pDS-His-pmk
mvd    TOPO-mvd-B         TOPO-mvd       n/a            pDS-mvd     pDS-His-mvd

n/a: not applicable; nd: not done



Table 20: Oligonucleotides for site-directed mutagenesis.

Gene/Site       Primer         Sequence (5'-3')                     SEQ ID NO
mvaA/BamHI-1    Mva-B1up       CCGGCATTCGGGCGGCATCCAGGTCTCGCTG      131
                Mva-B1down     CAGCGAGACCTGGATGCCGCCCGAATGCCGG      132
mvaA/BamHI-2    Mva-B2up       CGTGCAGGGCTGGATTCTGTCGGAATACCCG      133
                Mva-B2down     CGGGTATTCCGACAGAATCCAGCCCTGCACG      134
idi/BamHI       Idi-Bup2       GGGCTGCGCGCCGGCATCCGGCATTTCGACG      135
                Idi-Bdown2     CGTCGAAATGCCGGATGCCGGCGCGCAGCCC      136
hcs/EcoRI       Hcs-Eup        GGGTGCGACGGGCGAGTTCTTCGATGCGCGG      137
                Hcs-Edown      CCGCGCATCGAAGAACTCGCCCGTCGCACCC      138
hcs/NdeI        Hcs-Nup-c      CACGCCCGTCACATACGACGAATACGTTGCC      139
                Hcs-Ndown-c    GGCAACGTATTCGTCGTATGTGACGGGCGTG      140
pmk/BamHI       Pmk-Bup        GAGGCTCGGGCTTGGCTCCTCGGCGGCGGTG      141
                Pmk-Bdown      CACCGCCGCCGAGGAGCCAAGCCCGAGCCTC      142
mvd/BamHI       Mvd-Bup        CGGCACGCTGCTGGACCCGGGCGACGCCTTC      143
                Mvd-Bdown      GAAGGCGTCGCCCGGGTCCAGCAGCGTGCCG      144

In each pair, the first primer listed is the forward primer and the second is the reverse primer.


E. coli strain M15 [Villarejo and Zabin, J. Bacteriol. 120:466-474 (1974)] carrying the lacI
(lac repressor)-containing plasmid pREP4 (EMBL/GenBank accession number A25856)
was transformed with the ligation mixtures and recombinant cells were selected by
growth on LB agar plates supplemented with 100 mg/L ampicillin and 25 mg/L kanamycin.
Positive clones containing the correct mevalonate operon gene insert were verified
by PCR.

For expression of the inserted genes, each of the E. coli strains was grown overnight at
37°C in LB medium containing 25 mg/L kanamycin and 100 mg/L ampicillin. The next
day, 25 ml of fresh medium was inoculated with 0.5 ml of the overnight culture and the
new cultures were grown at 37°C. When the OD600 of the cultures reached 0.4, expression
of the cloned genes was induced by addition of isopropyl-β-D-thiogalactopyranoside
(IPTG) to a final concentration of 1 mM, and the incubation of the cultures (with shaking)
was continued for four hours, after which the cells were collected by centrifugation.

Crude extract preparation, HMG-CoA reductase assays, and IPP isomerase assays were
performed as described in Example 1. Tables 21 and 22 show the HMG-CoA reductase
and IPP isomerase activities, respectively, in the recombinant E. coli strains. Upon IPTG
induction, strains M15/pDS-mvaA and M15/pDS-idi contained high levels of the HMG-CoA
reductase and IPP isomerase activity, respectively. This illustrates the ability to
over-express the mevalonate pathway genes (and overproduce their cognate gene products in an
active form) from Paracoccus sp. strain R114 in E. coli.



Table 21. Induction of HMG-CoA reductase activity in E. coli strains over-expressing the
cloned mvaA gene from Paracoccus sp. strain R114.

Strain/plasmid        IPTG Induction    HMG-CoA reductase activity (U/mg)
M15/pDS-mvaA                -                    8.34
M15/pDS-mvaA                +                   90.0
M15/pDS-His-mvaA            -                    1.74
M15/pDS-His-mvaA            +                    2.95
M15/pDS-mvdᵃ                                     0.05

a M15/pDS-mvd was included as a negative control
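
By way of illustration only, the effect of IPTG in Table 21 can be expressed as a fold induction. The sketch below assumes that, for each plasmid, the row without a "+" is the uninduced culture; the names ACTIVITY and fold_induction are hypothetical and introduced here.

```python
# HMG-CoA reductase activity (U/mg) from Table 21: (uninduced, induced).
# Assumption: the row lacking a "+" mark is the uninduced culture.
ACTIVITY = {
    "M15/pDS-mvaA": (8.34, 90.0),
    "M15/pDS-His-mvaA": (1.74, 2.95),
}


def fold_induction(uninduced, induced):
    """Ratio of induced to uninduced specific activity."""
    return induced / uninduced


for strain, (minus_iptg, plus_iptg) in ACTIVITY.items():
    print(f"{strain}: {fold_induction(minus_iptg, plus_iptg):.1f}-fold")
```

On these figures the native construct responds to IPTG roughly ten-fold, whereas the His-tagged construct responds less than two-fold.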
Table 22. Induction of IPP isomerase activity in E. coli strains over-expressing the cloned
idi gene from Paracoccus sp. strain R114.

Strain/plasmid        IPTG Induction    IPP isomerase activity (U/mg)
M15/pDS-idi                 -           Not detectedᵇ
M15/pDS-idi                 +           22.0
M15/pDS-His-idi             -           Not detected
M15/pDS-His-idi             +           Not detected
M15/pDS-mvdᵃ                            Not detected

a M15/pDS-mvd was included as a negative control
b <1 U/mg



The crude extracts used for the enzyme assays were analyzed by sodium dodecylsulfate-
polyacrylamide gel electrophoresis (SDS-PAGE). For strains E. coli M15/pDS-mvaA and E.
coli M15/pDS-His-mvaA, the presence or absence of a highly expressed protein of the
expected molecular mass (36.3 kD) correlated with the HMG-CoA reductase activity
measured in the extracts (Table 21). The absence of the His-tagged protein could be
explained by reduced expression at the level of transcription or translation, or by
instability of the mRNA or the protein. The crude extracts of E. coli M15/pDS-idi and
E. coli M15/pDS-His-idi both showed highly expressed proteins of the expected molecular
masses of 37.3 kD and 39.0 kD, respectively. However, only the extract from E. coli
M15/pDS-idi had increased IPP isomerase activity (Table 22), indicating that the
histidine-tagged form of the enzyme was not functional under these conditions.

By SDS-PAGE analysis of crude extracts of E. coli strains over-expressing the other four
genes of the Paracoccus sp. strain R114 mevalonate operon (hcs, pmk, mvk, and mvd; refer
to Table 19), high expression of the native form of the enzyme was not detected upon IPTG
induction, although some expression cannot be ruled out. On the other hand, high
expression was observed with the His-tagged form of all four proteins.

Example 6: Improved Zeaxanthin Production in Paracoccus sp. strain R114 by
Over-Expression of the crtE Gene

Construction of pBBR-K-Zea4, pBBR-K-Zea4-up and pBBR-K-Zea4-down and effects of
these plasmids on zeaxanthin production in Paracoccus sp. strain R114. The carotenoid
(crt) gene cluster of Paracoccus sp. strain R1534 was excised from plasmid pZea-4
[Pasamontes et al., Gene 185:35-41 (1997)] as an 8.3 kb BamHI-EcoRI fragment. This
fragment containing the crt gene cluster was ligated into the BamHI and EcoRI-cut vector
pBBR1MCS-2 (GenBank accession #U23751), resulting in pBBR-K-Zea4. Plasmid
pBBR-K-Zea4 was introduced into Paracoccus sp. strain R114 by conjugation to test for improved
zeaxanthin production. The control strain R114 and two independent isolates of strain
R114/pBBR-K-Zea4 were tested for zeaxanthin production in shake flask cultures (using
medium 362F/2, see Example 11). The data in Table 23 show that both recombinant
strains carrying plasmid pBBR-K-Zea4 produced significantly higher levels of zeaxanthin
than R114, and had higher specific rates of production (mg zeaxanthin/OD660). This
suggested that one or more of the genes within the cloned insert in pBBR-K-Zea4 encoded
an enzyme(s) that was limiting for zeaxanthin production in Paracoccus sp. strain R114.



Table 23. Zeaxanthin production by strains R114 and R114/pBBR-K-Zea4.

                                   24 Hours           48 Hours           72 Hours
Strain                          ZXNᵃ     Spec.     ZXN      Spec.     ZXN      Spec.
                                (mg/l)   Form.ᵇ    (mg/l)   Form.     (mg/l)   Form.
R114                             54.5     2.2       81.7     4.1       78.1     4.5
R114/pBBR-K-Zea4 (clone 4)       41.0     3.0      100.7     5.2       97.6     6.2
R114/pBBR-K-Zea4 (clone 5)       41.1     3.1      110.5     5.7      102.1     6.5

a Zeaxanthin
b Specific Formation (mg ZXN/l/OD660)
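
By way of illustration only, the 72-hour figures in Table 23 can be compared with the R114 control, and the culture density implied by each pair of values can be back-calculated from the definition of specific formation (mg ZXN/l/OD660). Because the tabulated values are rounded, the implied OD660 is only approximate. The names AT_72H, relative_titer and implied_od660 are hypothetical and introduced here.

```python
# 72-hour values from Table 23: (ZXN in mg/l, specific formation in mg ZXN/l/OD660).
AT_72H = {
    "R114": (78.1, 4.5),
    "R114/pBBR-K-Zea4 (clone 4)": (97.6, 6.2),
    "R114/pBBR-K-Zea4 (clone 5)": (102.1, 6.5),
}


def relative_titer(strain, control="R114"):
    """Zeaxanthin titer of a strain divided by that of the control."""
    return AT_72H[strain][0] / AT_72H[control][0]


def implied_od660(strain):
    """OD660 implied by titer / specific formation (approximate)."""
    zxn, specific_formation = AT_72H[strain]
    return zxn / specific_formation


for strain in AT_72H:
    print(f"{strain}: {relative_titer(strain):.2f}x control, "
          f"OD660 about {implied_od660(strain):.1f}")
```

Read this way, the recombinant strains reach a higher titer at a somewhat lower implied cell density, which is the sense in which the specific rate of production is higher.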

To localize the positive effect, two plasmid derivatives were created that contained
subcloned regions of the cloned insert present in pBBR-K-Zea4. The "upstream" region of the
pBBR-K-Zea4 insert, comprising ORF 5 and the genes atoB and crtE (Pasamontes et al.,
supra), is flanked by unique sites for the restriction enzymes XbaI and AvrII. Plasmid
pBBR-K-Zea4-down was constructed by digesting pBBR-K-Zea4 with these two enzymes
and deleting the "upstream" region. Analogously, plasmid pBBR-K-Zea4-up was
constructed by deletion of the "downstream" region within the cloned insert in pBBR-K-Zea4,
using the restriction enzymes EcoRV and StuI. The two new plasmids were transferred to
Paracoccus sp. strain R114 by conjugation. Zeaxanthin production was compared (shake
flask cultures, same conditions as described above) in strains R114 (host control),
R114/pBBR-K (empty vector control), R114/pBBR-K-Zea4-down and R114/pBBR-K-Zea4-up
(Table 24). The data clearly showed that the positive effect on zeaxanthin
production was a result of the presence in multiple copies of the cloned segment containing
ORF5, atoB and crtE, i.e., the insert present in plasmid pBBR-K-Zea4-up. A series of
deletion plasmids was constructed from pBBR-K-Zea4-up. By introducing each of these
plasmids into strain R114 and testing for zeaxanthin production, it was determined that it
was over-expression of the crtE gene that was providing the improved zeaxanthin
production in strains R114/pBBR-K-Zea4 and R114/pBBR-K-Zea4-up. This result is consistent
with the activity of GGPP synthase (encoded by crtE) being limiting for zeaxanthin
production in Paracoccus sp. strain R114. Using the methods described in Example 1,
crude extract of strain R114/pBBR-K-Zea4-up was found to have 2.6-fold higher GGPP
synthase activity than R114. To prove this directly, a new plasmid allowing
over-expression of only the crtE gene was constructed as described in the following two sections.



Table 24. Zeaxanthin production by strains carrying deletion derivatives of plasmid
pBBR-K-Zea4.

                                   24 Hours           48 Hours           72 Hours
Strain                          ZXNᵃ     Spec.     ZXN      Spec.     ZXN      Spec.
                                (mg/l)   Form.ᵇ    (mg/l)   Form.     (mg/l)   Form.
R114                             35.0     1.2       75.7     4.1       73.9     4.4
R114/pBBR-K                      32.0     1.5       59.3     3.1       63.3     3.9
R114/pBBR-K-Zea4-up              51.5     2.2       98.8     5.5       85.5     5.7
R114/pBBR-K-Zea4-down            41.6     1.8       63.4     3.3       66.4     3.9

a Zeaxanthin
b Specific Formation (mg ZXN/l/OD660)

Construction of the expression vectors pBBR-K-PcrtE and pBBR-tK-PcrtE. The vector
pBBR1MCS-2 was cut with BstXI and Bsu36I and the larger fragment was ligated with the
annealed oligonucleotides MCS-2 up
(5' TCAGAATTCGGTACCATATGAAGCTTGGATCCGGGG 3') (SEQ ID NO:145) and
MCS-2 down (5' GGATCCAAGCTTCATATGGTACCGAATTC 3') (SEQ ID NO:146),
resulting in vector pBBR-K-Nde. The 270 bp region upstream of the crtE gene in the
carotenoid gene cluster from Paracoccus sp. strain R114, which contains the putative crtE
promoter (PcrtE) including the ribosome binding site and the crtE start codon (Pasamontes
et al., supra), was amplified from Paracoccus sp. strain R114 DNA by PCR with primers
crtE-up (5' GGAATTCGCTGCTGAACGCGATGGCG 3') (SEQ ID NO:147) and crtE-down
(5' GGGGTACCATATGTGCCTTCGTTGCGTCAGTC 3') (SEQ ID NO:148). The PCR
product was cut with EcoRI and NdeI and inserted into the EcoRI-NdeI cut backbone of
pBBR-K-Nde, yielding plasmid pBBR-K-PcrtE. An NdeI site, which contains the ATG
start codon of crtE, was included in primer crtE-down. Hence, any introduced coding
region with the start codon embedded in an NdeI site should be expressed using the
ribosomal binding site of crtE. The plasmid pBBR-K-PcrtE was cut with BamHI and the
annealed oligonucleotides pha-t-up
(5' GATCCGGCGTGTGCGCAATTTAATTGCGCACACGCCCCCTGCGTTTAAAC 3')
(SEQ ID NO:149) and pha-t-down
(5' GATCGTTTAAACGCAGGGGGCGTGTGCGCAATTAAATTGCGCACACGCCG 3')
(SEQ ID NO:150) were inserted. The insertion was verified by sequencing, and the version
of the plasmid having the oligos inserted in the orientation that reconstitutes the BamHI
site closer to the PcrtE promoter was named pBBR-tK-PcrtE. The inserted sequence carries
the putative transcriptional terminator found between the Paracoccus sp. strain R114 phaA
and phaB genes (see Example 10) and should, therefore, ensure proper termination of the
transcripts initiated from the PcrtE promoter.
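The annealing of the two terminator oligonucleotides can be checked with a short script. The following is only a minimal sketch: the sequences are copied from SEQ ID NO:149 and 150 as printed above, while the helper names (`reverse_complement`, `annealed_core`) and the way the ligation junctions are spelled out are illustrative assumptions, not part of the described method.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Sequences as printed in the text (SEQ ID NO:149 and SEQ ID NO:150).
PHA_T_UP = "GATCCGGCGTGTGCGCAATTTAATTGCGCACACGCCCCCTGCGTTTAAAC"
PHA_T_DOWN = "GATCGTTTAAACGCAGGGGGCGTGTGCGCAATTAAATTGCGCACACGCCG"


def annealed_core(top: str, bottom: str, overhang: int = 4) -> str:
    """Return the double-stranded core formed when two oligos anneal.

    Each oligo is assumed to carry a 5' single-stranded overhang of
    `overhang` bases (here GATC, compatible with a BamHI-cut vector).
    Raises ValueError if the remaining bases are not exact reverse complements.
    """
    top_core, bottom_core = top[overhang:], bottom[overhang:]
    if top_core != reverse_complement(bottom_core):
        raise ValueError("oligonucleotides do not anneal over their cores")
    return top_core


core = annealed_core(PHA_T_UP, PHA_T_DOWN)

# A BamHI-cut vector end contributes a G in front of the GATC overhang, so
# each junction reads G + overhang + first base of the core on that strand.
junction_up = "G" + PHA_T_UP[:5]
junction_down = "G" + PHA_T_DOWN[:5]

print(len(core))                   # length of the double-stranded region
print(junction_up == "GGATCC")     # BamHI site restored at the pha-t-up end
print(junction_down == "GGATCC")   # not restored at the pha-t-down end
```

Under these assumptions only one of the two junctions regenerates a BamHI site, which is consistent with the text distinguishing the two insert orientations by the position of the reconstituted BamHI site.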

Construction of plasmid pBBR-K-PcrtE-crtE-3. To construct a multi-copy plasmid for
increased expression of the crtE gene in the Paracoccus sp. strain R114 host, the crtE gene
was amplified from plasmid p59-2 (Pasamontes et al., supra) by PCR using the primers
crtE-Nde (5' AAGGCCTCATATGACGCCCAAGCAGCAATT 3') (SEQ ID NO:151) and
crtE-Bam (5' CGGGATCCTAGGCGCTGCGGCGGATG 3') (SEQ ID NO:152). The
amplified fragment was cloned in the pCR2.1-TOPO vector, resulting in plasmid
TOPO-crtE. The NdeI-BamHI fragment from TOPO-crtE was subcloned in NdeI-BamHI-digested
plasmid pBBR-K-PcrtE, yielding pBBR-K-PcrtE-crtE. Finally, pBBR-K-PcrtE-crtE-3
was constructed by replacing the smaller BglII fragment from pBBR-K-PcrtE-crtE
with the smaller BglII fragment from pBBR-K-Zea4-up. Plasmid pBBR-K-PcrtE-crtE-3
was transferred to Paracoccus sp. strain R114 by electroporation. Using the methods
described in Example 1, GGPP synthase activity in crude extracts was found to be 2.9-fold
higher in strain R114/pBBR-K-PcrtE-crtE-3 than in strain R114. This degree of elevated
activity was similar to that observed in R114/pBBR-K-Zea4-up. Table 25 shows that
zeaxanthin production by strain R114/pBBR-K-PcrtE-crtE-3 was essentially identical to
that of strain R114/pBBR-K-Zea4-up.



Table 25. Comparison of zeaxanthin production by strains R114/pBBR-K-PcrtE-crtE-3
and R114/pBBR-K-Zea4-up.

                              24 Hours            48 Hours            72 Hours
                              ZXN      Spec.      ZXN      Spec.      ZXN      Spec.
Strain                        (mg/l)   Form.      (mg/l)   Form.      (mg/l)   Form.

R114                          49.0     1.6        83.9     3.3        97.8     4.3
R114/pBBR-K                   42.6     1.8        73.7     3.8        88.8     4.9
R114/pBBR-K-PcrtE-crtE-3      64.6     2.9        127.0    5.8        165.6    8.5
R114/pBBR-K-Zea4-up           64.7     2.9        118.0    5.9        158.0    10.1

ZXN, zeaxanthin. Spec. Form., Specific Formation (mg ZXN/l/OD660).



Example 7: Expression of Individual Genes of the Paracoccus sp. strain R114 Mevalonate
Operon in the Native Host, Paracoccus sp. strain R114

Expression of individual cloned genes of the Paracoccus sp. strain R114 mevalonate operon
in the Paracoccus sp. strain R114 host. The mutagenized coding regions of the mevalonate
operon genes in TOPO-plasmids (see Example 5) were excised with BamHI and NdeI and
ligated with the BamHI-NdeI cleaved vector pBBR-tK-PcrtE (see Example 6). The resulting
plasmids pBBR-tK-PcrtE-mvaA, pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk,
pBBR-tK-PcrtE-pmk and pBBR-tK-PcrtE-mvd were introduced into Paracoccus
sp. strain R114 by electroporation. Transformants were selected on agar medium
containing 50 mg/l kanamycin and verified by PCR.

To illustrate that the plasmid-borne mevalonate pathway genes can be expressed in the
native host Paracoccus sp. strain R114, HMG-CoA reductase activity was compared in
crude extracts of strains R114/pBBR-K (control) and R114/pBBR-tK-PcrtE-mvaA
(methods used are set forth in Example 1). The specific activities of HMG-CoA reductase
in strains R114/pBBR-K and R114/pBBR-tK-PcrtE-mvaA were 2.37 U/mg and 6.0 U/mg,
respectively. Thus the presence of the mvaA gene on a multicopy plasmid (and expressed
from the PcrtE promoter) resulted in a 2.5-fold increase in HMG-CoA reductase activity
relative to the basal (i.e., chromosomally encoded) activity of R114 carrying the empty
vector pBBR-K.

Example 8: Construction of "Mini-Operons" for Simultaneous Over-Expression of the
Cloned Genes of the Mevalonate Pathway with the Paracoccus sp. strain
R114 crtE Gene

Plasmid constructions. As was shown in Example 6, introduction of plasmid
pBBR-K-PcrtE-crtE-3 into Paracoccus sp. strain R114 resulted in increased production of
zeaxanthin, indicating that GGPP synthase activity was rate limiting for zeaxanthin
biosynthesis in strain R114. Example 7 further showed that genes coding for the enzymes
of the mevalonate pathway could be over-expressed in the native host Paracoccus sp. strain
R114, and resulted in increased activity of the encoded enzyme. However, none of the
recombinant strains of Paracoccus sp. strain R114 that carried plasmids containing each
individual gene of the mevalonate operon showed increased zeaxanthin production
compared to strain R114. It is possible that the benefit of over-expression of the genes of
the mevalonate operon in Paracoccus sp. strain R114 could be masked by the downstream
"bottleneck" in the zeaxanthin pathway (GGPP synthase). Creation of plasmids that allow
simultaneous over-expression of each mevalonate pathway gene (or perhaps combinations
of these genes) together with crtE could relieve all rate limitations in the overall zeaxanthin
biosynthetic pathway, thereby improving zeaxanthin production. The next section
describes the construction of "mini-operons" designed to allow co-over-expression of crtE
and each of the genes coding for the five enzymes of the mevalonate pathway.

The crtE, mvaA, idi and mvk genes were excised from the respective TOPO-plasmids
(described in Examples 5 and 6) with BamHI and NdeI and ligated with BamHI-NdeI-cleaved
vector pOCV-1 (described in Example 12). The crtE gene does not have an
adenine as the last nucleotide of the coding region, and in addition, has a TAG rather than
a TGA stop codon and an unsuitable distance between the stop codon and the BamHI site.
Therefore, the end of crtE does not meet the requirements of the operon construction
vectors (refer to Example 12) and crtE must be the last gene in any operon constructed
with pOCV-1-crtE. To meet the requirement for an adenine as the first nucleotide of the
second codon and the last nucleotide of the last codon, mutations had to be introduced in
three genes of the mevalonate operon. The second codon of pmk, GAT, encoding Asp, was
changed into AAT, encoding Asn. The last codon of mvd ends with a T and the last
codons of pmk and hcs end with C. Changing these nucleotides to A results in silent
mutations except for pmk, where the last amino acid is changed from Asp to Glu.
Oligonucleotides were designed to introduce the necessary changes by PCR. The
sequences of the oligonucleotides and the templates used for those PCR reactions are
shown in Table 26. All PCR products were cloned in the pCR2.1-TOPO vector, resulting
in plasmids TOPO-mvdOCV, TOPO-pmkOCV and TOPO-hcsOCV. The inserts were excised
with NdeI and BamHI and ligated with the NdeI-BamHI cut backbone of pOCV-2 (see
Example 12). The final cloning steps to assemble each of the "mini-operons" were
analogous, and are illustrated by the representative scheme for construction of
pBBR-K-PcrtE-mvaA-crtE-3.



Table 26: Oligonucleotides and templates used for PCR in the construction of plasmids
TOPO-mvdOCV, TOPO-pmkOCV and TOPO-hcsOCV.

Gene  Primer    Direction  Sequence (5'-3')                 SEQ ID NO  Template

Hcs   Hcs-Nde   Forward    AAGGCCTCATATGAAAGTGCCTAAGATGA    123        pBBR-tK-PcrtE-hcs
Hcs   Hcs-mut3  Reverse    CCGGATCCTCATGCCTGCCGGTCGACATAG   153        pBBR-tK-PcrtE-hcs
Pmk   Pmk-mut5  Forward    GAAGGCACATATGAATCAGGTCATCCGCGC   154        pBBR-tK-PcrtE-pmk
Pmk   Pmk-mut3  Reverse    GCCGGATCCTCATTCATCGAAAACAAGTCC   155        pBBR-tK-PcrtE-pmk
Mvd   Mvd-Nde   Forward    AAGGCCTCATATGACTGATGCCGTCCGCGA   129        pBBR-tK-PcrtE-mvd
Mvd   Mvd-mut3  Reverse    ACGCCGGATCCTCATCGCCCCTCGAACGGC   156        pBBR-tK-PcrtE-mvd
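The codon changes described above can be read back from the primers in Table 26. The sketch below is illustrative only: it assumes the standard genetic code, takes the codon immediately preceding TGA + GGATCC on the sense strand as the last codon, and uses hypothetical helper names (`last_codon`, `second_codon`).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Reverse ("mut3") primers and the Pmk forward primer, copied from Table 26.
REVERSE_PRIMERS = {
    "hcs": "CCGGATCCTCATGCCTGCCGGTCGACATAG",
    "pmk": "GCCGGATCCTCATTCATCGAAAACAAGTCC",
    "mvd": "ACGCCGGATCCTCATCGCCCCTCGAACGGC",
}
PMK_MUT5 = "GAAGGCACATATGAATCAGGTCATCCGCGC"


def last_codon(reverse_primer: str) -> str:
    """Codon directly upstream of the TGA stop that abuts the BamHI site."""
    sense = reverse_complement(reverse_primer)
    stop_at = sense.index("TGAGGATCC")  # TGA stop followed by GGATCC
    return sense[stop_at - 3:stop_at]


def second_codon(forward_primer: str) -> str:
    """Codon directly after the ATG embedded in the NdeI site (CATATG)."""
    atg_at = forward_primer.index("CATATG") + 3
    return forward_primer[atg_at + 3:atg_at + 6]


last_codons = {gene: last_codon(p) for gene, p in REVERSE_PRIMERS.items()}
last_residues = {gene: CODON_TABLE[c] for gene, c in last_codons.items()}
pmk_second = second_codon(PMK_MUT5)

print(last_codons)
print(last_residues)
print(pmk_second, CODON_TABLE[pmk_second])
```

Read this way, every engineered last codon ends in A and the second codon of pmk begins with A, in line with the requirements stated for the operon construction vectors.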



Example 9: Cloning and Sequencing of the ispA Gene Encoding FPP Synthase from
Paracoccus sp. strain R114

Because FPP synthase lies in the central pathway for zeaxanthin biosynthesis in Paracoccus
sp. strain R114, increasing the activity of this enzyme by increasing the dosage of the ispA
gene has the potential to improve zeaxanthin production. For this reason, the ispA gene
from Paracoccus sp. strain R114 was cloned and sequenced as follows. The amino acid
sequences of six bacterial FPP synthases were obtained from public databases. These
sequences have several highly conserved regions. Two such regions, and the oligonucleotides
used for PCR, are shown in Table 27. PCR with oligonucleotides GTT-1 and GTT-2,
using Paracoccus sp. strain R114 DNA as template, gave a product of the expected size.
The PCR product was cloned in the vector pCR2.1-TOPO and sequenced. The cloned
fragment was used as a probe for a Southern analysis of Paracoccus sp. strain R114 DNA
and was found to hybridize to a BamHI-NcoI fragment of about 1.9 kb. Paracoccus sp.
strain R114 DNA was cut with BamHI and NcoI and the fragments were separated by
agarose gel electrophoresis. The region between 1.5 and 2.1 kb was isolated and cloned in
the BamHI and NcoI sites of a cloning vector. This partial library was then screened using
the ispA-PCR fragment as a probe, and two positive clones were isolated. Sequencing
confirmed that the plasmids of both clones contained the ispA gene. Upstream of ispA (SEQ
ID NO:159) is the gene for the small subunit of exonuclease VII, XseB (SEQ ID NO:158),
and downstream is the dxs gene (SEQ ID NO:160) encoding the 1-deoxyxylulose-5-phosphate
synthase. This is the same gene arrangement as found in E. coli. The sequence of the
NcoI-BamHI fragment is illustrated in SEQ ID NO:157; the amino acid sequences of XseB,
IspA and Dxs are illustrated in SEQ ID NO:158, SEQ ID NO:159, and SEQ ID NO:160,
respectively. The start codon of ispA may be GTG or ATG resulting in two or one
methionine residues, respectively, at the amino-terminus of the native IspA.

Using the same general cloning strategy described in Examples 5-7, a new plasmid,
pBBR-tK-PcrtE-ispA-2, was constructed to allow for over-expression of the ispA gene in the
native host Paracoccus sp. strain R114. The plasmid was introduced into strain R114 by
electroporation, and transformants were confirmed by PCR. Three representative transformants
and a control strain (R114/pBBR-K) were grown in 362F/2 medium (Example 11), crude
extracts were prepared and assayed for activity of the ispA gene product, FPP synthase,
according to the methods described in Example 1. The basal (chromosomally-encoded)
FPP synthase specific activity in R114/pBBR-K was 62.6 U/mg. The FPP synthase activity
in the three transformants was 108.3 U/mg (73% increase), 98.5 U/mg (57% increase) and
83.8 U/mg (34% increase), demonstrating the over-expression of the ispA gene and
over-production of its product, FPP synthase, in an active form in Paracoccus sp. strain R114.
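The percentage increases quoted above follow directly from the specific activities. As a minimal sketch (the function name `percent_increase` is illustrative, and the activities are the values given in the text):

```python
def percent_increase(value: float, baseline: float) -> float:
    """Relative increase of `value` over `baseline`, in percent."""
    return (value / baseline - 1.0) * 100.0


BASAL_FPP_SYNTHASE = 62.6                     # U/mg, R114/pBBR-K
TRANSFORMANT_ACTIVITIES = [108.3, 98.5, 83.8]  # U/mg, three transformants

increases = [
    round(percent_increase(activity, BASAL_FPP_SYNTHASE))
    for activity in TRANSFORMANT_ACTIVITIES
]

for activity, increase in zip(TRANSFORMANT_ACTIVITIES, increases):
    print(f"{activity} U/mg: {increase}% above basal")
```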



Table 27: Oligonucleotides designed from two conserved bacterial IspA peptides.

Peptide 1
Bradyrhizobium japonicum             Val His Asp Asp Leu Pro    (SEQ ID NO:161)
Rhizobium sp. strain NGR234          Val His Asp Asp Leu Pro    (SEQ ID NO:162)
Bacillus stearothermophilus          Ile His Asp Asp Leu Pro    (SEQ ID NO:163)
Bacillus subtilis                    Ile His Asp Asp Leu Pro    (SEQ ID NO:164)
Escherichia coli                     Ile His Asp Asp Leu Pro    (SEQ ID NO:165)
Haemophilus influenzae               Ile His Asp Asp Leu Pro    (SEQ ID NO:166)
Oligonucleotide GTT-1 (5'-3')        tc cay gay gay ctg cc      (SEQ ID NO:167)

Peptide 2
Bradyrhizobium japonicum             Asp Asp Ile Leu Asp        (SEQ ID NO:168)
Rhizobium sp. strain NGR234          Asp Asp Ile Leu Asp        (SEQ ID NO:169)
Bacillus stearothermophilus          Asp Asp Ile Leu Asp        (SEQ ID NO:170)
Bacillus subtilis                    Asp Asp Ile Leu Asp        (SEQ ID NO:171)
Escherichia coli                     Asp Asp Ile Leu Asp        (SEQ ID NO:172)
Haemophilus influenzae               Asp Asp Ile Leu Asp        (SEQ ID NO:173)
Reverse complement of
Oligonucleotide GTT-2 (5'-3')        gay gay atc ctg gay        (SEQ ID NO:174)






WO 02/099095 PCT/EP02/06171 






Y = C or T 
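The relationship between the degenerate oligonucleotides and the conserved peptides can be illustrated by expanding each Y position and translating every variant. This is a sketch under stated assumptions: the standard genetic code is used, GTT-1 is read with the conserved codons starting at its third base, and the Ile codon in the GTT-2 reverse complement is taken as atc.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Only the IUPAC codes needed here; Y = C or T as stated in the table note.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}


def expand(degenerate: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    pools = [IUPAC[base] for base in degenerate.upper()]
    return ["".join(choice) for choice in product(*pools)]


def translate(seq: str, offset: int = 0) -> str:
    """Translate complete codons of `seq`, starting at `offset`."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


GTT_1 = "TCCAYGAYGAYCTGCC"       # tc cay gay gay ctg cc
GTT_2_RC = "GAYGAYATCCTGGAY"     # gay gay atc ctg gay

gtt1_peptides = {translate(variant, offset=2) for variant in expand(GTT_1)}
gtt2_peptides = {translate(variant) for variant in expand(GTT_2_RC)}

print(len(expand(GTT_1)), gtt1_peptides)
print(len(expand(GTT_2_RC)), gtt2_peptides)
```

Each oligonucleotide has three Y positions, so it represents eight sequences; under the assumptions above, all of them encode the same conserved residues.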



Example 10: Cloning and Sequencing of the Genes Coding for Acetyl-CoA
Acetyltransferase from Paracoccus sp. strain R114

The first committed step in IPP biosynthesis is the condensation of acetyl-CoA and
acetoacetyl-CoA to hydroxymethylglutaryl-CoA (HMG-CoA) by HMG-CoA synthase. The
substrate acetoacetyl-CoA is formed by the enzyme acetyl-CoA acetyltransferase (also known
as acetoacetyl-CoA thiolase or β-ketothiolase) by condensation of two molecules of
acetyl-CoA. Because this reaction links central metabolism (at acetyl-CoA) to isoprenoid
biosynthesis via the mevalonate pathway, increasing the activity of acetyl-CoA acetyltransferase
by gene amplification has the potential to increase carbon flow to carotenoids and other
isoprenoids in vivo. In Paracoccus sp. strain R114, there are at least two genes, atoB and
phaA, that encode acetyl-CoA acetyltransferases. The end of the atoB gene is 165 nucleotides
upstream of the start of crtE in Paracoccus sp. strains R1534 (US 6,087,152) and R114
(this work). The nucleotide sequence of the atoB gene and the corresponding amino acid
sequence of the encoded acetyl-CoA acetyltransferase from Paracoccus sp. strain R1534 are
illustrated in SEQ ID NO:175 and SEQ ID NO:176, respectively.

Using the same general strategy as described in Example 5, the atoB gene was cloned in
plasmids pDS and pDS-His. The new plasmids, pDS-atoB and pDS-His-atoB, were
introduced into E. coli strain M15. The resulting strains M15/pDS-atoB and
M15/pDS-His-atoB were grown with and without IPTG induction (as described in Example 5),
and crude extracts were prepared for acetyl-CoA acetyltransferase assays (methods used were
described in Example 1) and SDS-PAGE analysis. The acetyl-CoA acetyltransferase
specific activities in extracts of M15/pDS-atoB and M15/pDS-His-atoB (with IPTG
induction) were 0.2 U/mg and 13.52 U/mg, respectively. The basal activity measured in E.
coli without the plasmids was 0.006 U/mg. Upon IPTG induction the atoB gene product,
acetyl-CoA acetyltransferase, is overproduced in E. coli M15. Both the native
(M15/pDS-atoB) and His-tagged (M15/pDS-His-atoB) forms were overproduced. The degree of
overproduction was much higher in M15/pDS-His-atoB, consistent with the measured
acetyl-CoA acetyltransferase activity in the (induced) extracts of the two strains.

Acetoacetyl-CoA is also the substrate for the first committed step in
polyhydroxyalkanoate (PHA) biosynthesis. In many bacteria the genes involved in PHA
biosynthesis are grouped in operons [Madison and Huisman, Microbiol. Mol. Biol. Rev.,
63:21-53 (1999)]. In Paracoccus denitrificans the phaA and phaB genes, encoding the
acetyl-CoA acetyltransferase and acetoacetyl-CoA reductase, respectively, are clustered in
an operon [Yabutani et al., FEMS Microbiol. Lett. 133:85-90 (1995)] whereas phaC, the
gene encoding the last enzyme in the pathway, poly(3-hydroxyalkanoate) synthase, is not
part of this operon [Ueda et al., J. Bacteriol. 178:774-779 (1995)]. PCR fragments
containing parts of phaA from Paracoccus sp. strain R1534 and phaC from Paracoccus sp.
strain R114 were obtained using primers based on the P. denitrificans phaA and phaC gene
sequences. The PCR fragments were then used as probes to screen a Paracoccus sp. strain
R114 λ-library (see Example 4). Several λ-clones hybridizing with the phaA or the phaC
probes were isolated, and the presence of the phaA or phaC genes in the inserts was
verified by sequence analysis. Three phaA λ-clones were further analyzed by subcloning
and sequencing, whereby the phaB gene was found downstream of phaA. Therefore, as is the
case in P. denitrificans, the phaA and phaB genes are clustered whereas the phaC gene is
located elsewhere in the genome. The nucleotide sequence of the phaAB cluster from
Paracoccus sp. strain R114 and the deduced amino acid sequences of the acetyl-CoA
acetyltransferase (PhaA) are illustrated in SEQ ID NO:177, and SEQ ID NOs:178 and 179,
respectively. The clustering of genes involved in PHA biosynthesis in operons suggests
that at least phaA and phaB are expressed together when the cell produces
poly(3-hydroxyalkanoates). On the other hand, a putative transcriptional stop signal is found
between the phaA and phaB genes from Paracoccus sp. strain R114 which is absent in the P.
denitrificans phaAB operon (Yabutani et al., supra). Thus, the expression of the two genes
might not be coupled in Paracoccus sp. strain R114.

Using the same general strategy as described in Example 5, the phaA gene was cloned in
plasmid pDS-His. The new plasmid, pDS-His-phaA, was introduced into E. coli strain
M15. The resulting strain M15/pDS-His-phaA was grown with and without IPTG
induction (as described in Example 5) and crude extracts were prepared for SDS-PAGE
analysis. The cloned His-tagged Paracoccus sp. strain R114 PhaA (acetyl-CoA
acetyltransferase) is overproduced upon IPTG induction in the E. coli M15 host.

The potential benefit of amplifying the atoB or phaA genes, encoding acetyl-CoA
acetyltransferase, on zeaxanthin production is mentioned above. In addition, it may be
beneficial for zeaxanthin production to decrease or eliminate the activity of acetoacetyl-CoA
reductase (the phaB gene product) to avoid diversion of some of the acetoacetyl-CoA formed
in vivo to the PHA pathway. Mutants of Paracoccus sp. strain R114 lacking activity of phaB
could be obtained by gene replacement techniques (specifically replacing the wild-type
phaB gene in the chromosome with an inactive form of the gene) or by classical
mutagenesis and screening.




Example 11: Model for the Industrial Production of Lycopene Using Mutants Derived
from Paracoccus sp. strain R114

Lycopene is a red carotenoid that is an intermediate in the biosynthesis of zeaxanthin in
the new Paracoccus species represented by strain R-1512 and its mutant derivatives R1534
and R114. As lycopene itself has significant commercial potential, it was of interest to test
the potential of the new Paracoccus species to produce lycopene by industrial fermentation.
To obtain mutants blocked in zeaxanthin biosynthesis that accumulated lycopene,
Paracoccus sp. strain R114 was subjected to mutagenesis with ultraviolet (UV) light
followed by screening for red colonies. The UV mutagenesis was performed as follows.
An overnight culture of strain R114 was grown in ME medium (see Example 2). The
overnight culture was subcultured into fresh ME medium (initial OD610 = 0.1) and
incubated at 28°C for 3 hours. Aliquots of this culture were centrifuged and the pellet
washed with 20 mM potassium phosphate buffer (pH 7.2). After a second centrifugation,
the pellet was resuspended to a final OD610 of 0.1. Ten milliliter aliquots of the cell
suspension were placed in a sterile 100-ml glass beaker. The thin layer of cell suspension
was irradiated with UV light at a flux of 1450 µW/cm2 for a pre-determined optimal length
of time. The cell suspension was mixed during the irradiation by means of a paper clip in
the beaker and a magnetic stirrer. The mutagenized cell suspensions (and the
unmutagenized controls) were plated on 362F/2 agar medium (Table 28). Triplicate viable
plate counts (in dim room light) were done on suspensions before and after mutagenesis.
Plates were incubated for 4-5 days at 28°C, and the colonies were scored. Several red
colonies (putative lycopene producers) were identified and purified by re-streaking. One
mutant, designated UV7-1, was further evaluated for lycopene production.

Table 29 shows the zeaxanthin production and lycopene production by the control strain
R114 and its mutant derivative UV7-1. Strain R114 produced only zeaxanthin. Mutant
UV7-1 produced mostly lycopene, but also produced a residual amount of zeaxanthin,
suggesting that the mutational block in UV7-1 (presumably in the crtY gene) is not
complete. These results show that it is possible to derive lycopene producing strains from
Paracoccus sp. strain R114.

Table 28. Recipe and preparation for medium 362F/2

Component                      Amount

Glucose monohydrate            33 g
Yeast extract                  10 g
Tryptone                       10 g
NaCl
MgSO4·7H2O                     2.5 g
Agar (for solid medium)        20 g
Distilled water                To 932 ml
- Adjust pH to 7.4
- Sterilize by filtration (liquid medium) or autoclaving (solid medium)
- Add 2.5 ml each of microelements solution, NKP solution and CaFe solution

Microelements solution         Amount per liter distilled water

(NH4)2Fe(SO4)2·6H2O            80 g
ZnSO4·7H2O                     6 g
MnSO4·H2O                      2 g
NiSO4·6H2O                     0.2 g
EDTA                           6 g
- Sterilize by filtration

NKP solution                   Amount per liter distilled water

K2HPO4                         250 g
(NH4)2HPO4                     300 g
- Sterilize by filtration

CaFe solution                  Amount per liter distilled water

CaCl2·2H2O                     75 g
FeCl3·6H2O                     5 g
Concentrated HCl               3.75 ml
- Sterilize by filtration

Table 29. Zeaxanthin and lycopene production by Paracoccus sp. strain R114 and its red
mutant derivative UV7-1.

Time        Strain     Zeaxanthin (mg/l)    Lycopene (mg/l)

24 hours    R114       36.65                0
            UV7-1      3.85                 20.85
48 hours    R114       72.95                0
            UV7-1      5.75                 85.95
72 hours    R114       83.9                 0
            UV7-1      5.85                 124.55
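The statement that UV7-1 produced mostly lycopene can be quantified from the UV7-1 values in Table 29. The following is a small illustrative calculation; the dictionary layout and the function name `lycopene_fraction` are assumptions made for the sketch.

```python
# UV7-1 values from Table 29: hours -> (zeaxanthin mg/l, lycopene mg/l)
TABLE_29_UV7_1 = {
    24: (3.85, 20.85),
    48: (5.75, 85.95),
    72: (5.85, 124.55),
}


def lycopene_fraction(zeaxanthin: float, lycopene: float) -> float:
    """Lycopene as a fraction of the two carotenoids reported in Table 29."""
    return lycopene / (zeaxanthin + lycopene)


fractions = {
    hours: lycopene_fraction(zeaxanthin, lycopene)
    for hours, (zeaxanthin, lycopene) in TABLE_29_UV7_1.items()
}

for hours, fraction in fractions.items():
    print(f"{hours} h: lycopene is {fraction:.1%} of zeaxanthin + lycopene")
```

On these figures the lycopene share rises over the course of the cultivation while a small residual amount of zeaxanthin remains, as described for the incomplete block in UV7-1.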



Example 12: Model for the Industrial Production of Astaxanthin by Fermentation Using
Strains Derived from Paracoccus sp. strain R114

Astaxanthin is a commercially important carotenoid used primarily in the aquaculture
industry. EP 872,554 showed that astaxanthin production could be achieved in E. coli by
introducing plasmids containing combinations of the cloned carotenoid (crt) genes from
Paracoccus sp. strain R1534 and Paracoccus carotinifaciens E-396. Together, the cloned crt
genes (crtEBIYZ) and crtW (β-carotene β-4 oxygenase) encoded a total biosynthetic
pathway from FPP through zeaxanthin to astaxanthin. The sequences of the P. carotinifaciens
E-396 crtW, Paracoccus sp. R1534 crtZ, and Paracoccus sp. R1534 crtE genes and encoded
polypeptides are set forth in SEQ ID NOs:180 and 181 (crtW); 182 and 184 (crtZ); and
184 and 185 (crtE). However, it was not shown that astaxanthin production could be
achieved in the Paracoccus sp. strain R114 host family. To demonstrate the utility of
recombinant strains derived from strain R114 for astaxanthin production, the cloned crtW
gene (SEQ ID NO:180) was introduced into strain R114 as follows.



Table 30. PCR primers used for the work described in Example 12.

Primer name    Sequence

CrtW-Nde       5' AAGGCCTCATATGAGCGCACATGCCCTGCC 3' (SEQ ID NO:186)
CrtW-Bam       5' CGGGATCCTCATGCGGTGTCCCCCTTGG 3' (SEQ ID NO:187)
CrtZ-Nde       5' AAGGCCTCATATGAGCACTTGGGCCGCAAT 3' (SEQ ID NO:188)
CrtZ-Bam       5' AGGATCCTCATGTATTGCGATCCGCCCCTT 3' (SEQ ID NO:189)

The crtW gene was amplified by PCR from the cloned crt cluster of Paracoccus
carotinifaciens strain E-396T (Tsubokura et al., supra; EP 872,554) using the primers crtW-Nde
and crtW-Bam (Table 30). The primers were designed such that the ATG start codon
constitutes the second half of an NdeI site (cleavage recognition site CATATG), and a BamHI
site (GGATCC) was introduced immediately after the stop codon. The PCR product was
cloned in the pCR2.1-TOPO vector, resulting in plasmid TOPO-crtW. The crtW gene was
excised with NdeI and BamHI and subcloned in the NdeI-BamHI cut vector pBBR-K-PcrtE
(described in Example 6) to create plasmid pBBR-K-PcrtE-crtW.

Plasmid pBBR-K-PcrtE-crtW was transferred to Paracoccus sp. strain R114 using a standard
bacterial conjugation procedure (E. coli strain S17 [Priefer et al., J. Bacteriol. 163:324-330
(1985)] was the donor organism). Transconjugants were selected on medium 362F/2 agar
(Table 28) containing 50 mg/l kanamycin and purified by restreaking on the same
medium. The presence of plasmid pBBR-K-PcrtE-crtW in the strain was confirmed by
PCR. Carotenoid production by strains R114 (host control), R114/pBBR-K (empty vector
control) and R114/pBBR-K-PcrtE-crtW was measured in shake flask cultures as described
in Examples 1 and 2, except that liquid 362F/2 medium was used instead of ME medium.
These results are shown in Table 31. The control strains R114 and R114/pBBR-K
produced only zeaxanthin. In strain R114/pBBR-K-PcrtE-crtW, the zeaxanthin was
completely consumed by the plasmid-encoded β-carotene β-4 oxygenase. However, although
astaxanthin was produced, two other ketocarotenoids, adonixanthin and canthaxanthin,
accumulated at higher levels. This indicated an imbalance in vivo of the β-carotene
hydroxylase (encoded by the chromosomal crtZ gene in strain R114) and the cloned
β-carotene β-4 oxygenase (CrtW).

To test this hypothesis, two new plasmids were created that contained the crtZ and crtW
genes together in mini-operons. The order of the genes was made different in the two
constructs (i.e., crtZ-crtW and crtW-crtZ) to try to create different ratios of expression of
the crtZ and crtW genes. The construction of the new plasmids required the assembly of a
special set of cloning vectors as follows. A series of operon construction vectors (based on
the vector pCR2.1-TOPO) was designed to facilitate the assembly of genes (in this case,
crtZ and crtW) into operons. The genes of interest must have an ATG start codon,
embedded in an NdeI site (CATATG), and a TGA stop codon immediately followed by a
BamHI site.

Table 31. Astaxanthin production in Paracoccus sp. strain R114 containing plasmids
expressing the crtW gene alone and in combination with the crtZ gene.



Strain 



R114 

RHiTpBBR-K 



Rl 14/pBBR-K-Pcrf£ -crrW 0 



Rl 14/pBBR-K-Pcrt£ -crtWZ 0 



R114/pBBR-K-Pcrr£-crfZW 0 



R114 



R114/pBBR-K 



Rl 14/pBBR-K-Pcrf£ -crtW 0 



Rl 14/pBBR-K-Pcrf£ -crtWZ 0 



24 hours 



ZXN ADN CXN 



13.0 



14.9 



18.0 



21.8 



29.5 



20.4 



2.3 



1.3 



7.3 



Total Sp. Form.' 



46.5 



41.4 



37.5 



45.6 



45.65 



2.2 



2.1 



2.1 



2.1 



48 hours 



72.6 
70T 



0 
"0 



26.7 
30.9 



R114/pBBR-K-Pcrf£ -crtZW 



0 



15.7 



11.2 



58.3 



72 hours 



R114 



82.5 



R114/pBBR-K 



82.9 



R114/pBBR-K-Pcrr£ -crtW 



0 



0 



0 



19.7 




0 



0 



17.0 



0 



0 



46.8 



82.5 



82.9 



83.5 



5.1 



5.2 



10 



a ZXN, zeaxanthin; ADN, adonixanthin; CXN, canthaxanthin; AXN, astaxanthin.
b Specific Formation, expressed as mg/l total carotenoid/OD660.

Furthermore, the first nucleotide after the start codon and the last nucleotide before the
stop codon must be adenine and the gene must lack sites for at least one of the enzymes
BsgI, BseMII, BseRI and GsuI. Four operon construction vectors were constructed,
differing in the arrangements of their polylinker sequences (SEQ ID NOs:190-197). The
cleavage sites of the first two enzymes are within the NdeI site. The cleavage sites of the
last two enzymes are before the BamHI site. The BseRI site in pOCV-1 and pOCV-4 is not
unique and cannot be used for operon construction.
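Whether a candidate gene lacks sites for these enzymes can be checked by scanning both strands for the recognition sequences. The sketch below is illustrative: the BseMII sequence (CTCAG) is the one given in this Example, whereas the sequences used for BsgI, BseRI and GsuI are the commonly catalogued ones and are an assumption here; the test sequences are hypothetical toy genes, not sequences from this work.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Recognition sequences (5'-3'). CTCAG for BseMII is stated in the text;
# the other three are assumed from general restriction enzyme catalogues.
RECOGNITION_SITES = {
    "BsgI": "GTGCAG",
    "BseMII": "CTCAG",
    "BseRI": "GAGGAG",
    "GsuI": "CTGGAG",
}


def usable_enzymes(gene: str) -> list[str]:
    """Enzymes whose recognition site is absent from both strands of `gene`."""
    gene = gene.upper()
    return [
        name
        for name, site in RECOGNITION_SITES.items()
        if site not in gene and reverse_complement(site) not in gene
    ]


# Hypothetical toy sequences: the first carries a BseMII site (CTCAG),
# the second has that site changed to CACAG.
toy_gene = "ATGAGCCTCAGGCACTGA"
toy_gene_mutated = "ATGAGCCACAGGCACTGA"

print(usable_enzymes(toy_gene))
print(usable_enzymes(toy_gene_mutated))
```

Checking the reverse complement as well matters because these recognition sequences are not palindromic, so a site on the opposite strand would otherwise be missed.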

The genes to be assembled in operons are first inserted individually between the NdeI and
the BamHI sites of the appropriate operon construction vectors. The resulting plasmid
with the upstream gene of the envisioned operon is then cut with one of the two enzymes
at the end of the polylinker and with an enzyme that has a unique site within the vector
backbone. The plasmid containing the downstream gene of the envisioned operon is cut
with one of the first two enzymes of the polylinker and with the same enzyme (with a
unique site in the vector backbone) used for the first plasmid (containing the desired
upstream gene). The fragments carrying the genes are isolated and ligated, resulting in a
pOCV plasmid with both genes between the NdeI and the BamHI sites. More genes can be
added in an analogous fashion. The assembled genes overlap such that the first two
nucleotides, TG, of the TGA stop codon of the upstream gene coincide with the last two
nucleotides of the ATG start codon of the downstream gene. The same overlap is found
between all genes in the carotenoid (crt) operon (crtZYIB) in Paracoccus sp. strain R1534
(Pasamontes et al., supra).
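The overlap described above can be sketched at the sequence level. If the last codon of the upstream gene ends in A and the second codon of the downstream gene begins with A, both genes share the four nucleotides ATGA at the junction. The function `join_genes` and the two toy open reading frames below are hypothetical and serve only to illustrate the overlap, not the restriction and ligation steps themselves.

```python
OVERLAP = "ATGA"  # A + TGA (upstream end) coincides with ATG + A (downstream start)


def join_genes(upstream: str, downstream: str) -> str:
    """Join two coding sequences so that stop and start codons overlap.

    The upstream gene must end with A followed by the TGA stop codon, and
    the downstream gene must start with ATG followed by A.
    """
    if not upstream.endswith(OVERLAP):
        raise ValueError("upstream gene must end with A + TGA")
    if not downstream.startswith(OVERLAP):
        raise ValueError("downstream gene must start with ATG + A")
    return upstream + downstream[len(OVERLAP):]


# Hypothetical toy genes (not sequences from this work).
gene_a = "ATGAGCAAACCATGA"   # ATG AGC AAA CCA TGA
gene_b = "ATGACTGGGGAATGA"   # ATG ACT GGG GAA TGA

operon = join_genes(gene_a, gene_b)
downstream_start = len(gene_a) - len(OVERLAP)

print(operon)
print(operon[:len(gene_a)] == gene_a)
print(operon[downstream_start:] == gene_b)
```

Because the joined product again ends with A + TGA, a further gene could be appended in the same way, which mirrors the statement that more genes can be added in an analogous fashion.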

The pOCV backbone is derived from pCR2.1-TOPO. The BseMII site in the region
necessary for replication, upstream of the ColE1 origin, was eliminated by site directed
mutagenesis changing the site from CTCAG into CACAG. The remaining three BseMII
sites and one GsuI site were eliminated by removing a 0.8 kb DdeI-Asp700 fragment. The
remaining vector was blunt-end ligated after fill-in of the DdeI recessed end. The
polylinkers were inserted between the BamHI and XbaI sites by means of annealed
oligonucleotides with the appropriate 5' overhangs.

Plasmid pBBR-K-PcrtE-crtZW was constructed using the operon construction vector
pOCV-2 as follows. The crtZ gene was amplified by PCR from Paracoccus sp. strain R114
using the primers crtZ-Nde and crtZ-Bam (Table 30). The primers were designed such
that the ATG start codon constitutes the second half of an NdeI site (cleavage recognition
site CATATG) and a BamHI site (GGATCC) was introduced immediately after the stop
codon. The PCR product was cloned in the pCR2.1-TOPO vector, resulting in plasmid
TOPO-crtZ. To assemble the two genes in a mini-operon, both genes, crtZ and crtW, were
excised with NdeI and BamHI from the plasmids TOPO-crtZ and TOPO-crtW and
subcloned in the NdeI-BamHI cut vector pOCV-2, creating plasmids pOCV-2-crtZ and
pOCV-2-crtW. Plasmid pOCV-2-crtZ was cut with BseMII and PstI (there is a unique PstI
site in the kanamycin resistance gene) and the 2.4 kb fragment (containing crtZ) was
ligated with the crtW-containing 1876 bp BseRI-PstI fragment from pOCV-2-crtW. The
resulting plasmid, pOCV-2-crtZW, was cut with NdeI and BamHI and the crtZW fragment
was ligated with the NdeI-BamHI backbone of pBBR-K-PcrtE to yield
pBBR-K-PcrtE-crtZW. Plasmid pBBR-K-PcrtE-crtWZ was constructed in an analogous fashion.

The data in Table 31 show that the ratio of adonixanthin, canthaxanthin and astaxanthin
did not change appreciably in strain R114/pBBR-K-PcrtE-crtWZ compared to strain
R114/pBBR-K-PcrtE-crtW. However, in strain R114/pBBR-K-PcrtE-crtZW, the production of
the ketocarotenoids was shifted in favor of astaxanthin. This result indicates that the level of
expression is dependent on the position of the gene within the mini-operon, and suggests
that increasing the in vivo level of β-carotene hydroxylase activity creates a balance
between the activities of this enzyme and β-carotene β-4 oxygenase that is more favorable
for full conversion of zeaxanthin to astaxanthin.

The results described in this Example also show that it is possible, through appropriate
genetic engineering, to produce not only astaxanthin, but also other ketocarotenoids of
commercial interest in Paracoccus sp. strain R114 or its relatives. For example, expression
of a gene coding for β-carotene β-4 oxygenase in a crtZ mutant of strain R114 (lacking
β-carotene hydroxylase activity) would provide for production of exclusively
ketocarotenoids, e.g., echinenone or canthaxanthin, without co-production of hydroxylated
carotenoids. Taken together, the results presented in this Example and Example 11 show the
broad utility of Paracoccus sp. strain R114 and its relatives to produce industrially
important carotenoids.

Example 13: Accumulation of mevalonate in cultures of Paracoccus sp. strain R114
overexpressing genes of the mevalonate pathway
Overexpression of the genes of the mevalonate pathway in Paracoccus sp. strain R114 leads
to increased carbon flow through the mevalonate pathway. The construction of plasmid
pBBR-K-mev-op16-2 was described in Example 5. Plasmid pBBR-K-mev-op-up-4 was
constructed as follows. A DNA fragment containing most of the mvaA gene
and the entire idi and hcs genes was obtained on a 3.1 kb SmaI-SalI fragment following
partial digestion of a λ-clone containing the Paracoccus sp. strain R114 mevalonate operon
(see Example 4). This fragment was subcloned in pUC19, yielding the plasmid
pUC19mev-op-up'. To facilitate subcloning, the KpnI-HindIII fragment of
pUC19mev-op-up' containing the mevalonate genes was recloned in the vector pBluescriptKS+,
resulting in plasmid pBluKSp-mev-op-up'. A 1.7 kb SalI fragment from pUC19mev-op-up'
was then cloned in the SalI site of plasmid 2ES2-1, which is a pUC19-derived plasmid
containing the cloned SalI-EcoRI fragment M from Paracoccus sp. strain R114 (refer to
Example 4). This resulted in plasmid pUC19mev-op-up-2. Plasmid pUCmev-op-up-3
was then obtained by combining the BbsI-BsaI fragment from pUC19mev-op-up-2
carrying the beginning of the mevalonate operon with the BbsI-BsaI fragment from
pBluKSp-mev-op-up' containing idi and hcs. Separately, a unique MluI site was
introduced between the NsiI and KpnI sites of the vector pBBR1MCS-2 (refer to Example
5) by inserting an annealed primer containing an MluI restriction site. The resulting new
cloning vector pBBR-K-Mlu was cut with MluI and KpnI and the MluI-KpnI fragment
from pUCmev-op-up-3, containing the first three genes of the mevalonate operon, was
inserted, yielding plasmid pBBR-K-mev-op-up-3. Plasmid pBBR-K-mev-op-up-4 was
then constructed by insertion of the SmaI fragment from plasmid 16SB3, which contains
most of the mvk gene and the 5' end of pmk (plasmid 16SB3 is a pUC19-derived plasmid
containing the Paracoccus sp. strain R114 SalI-BamHI fragment A; refer to Example 4).
The insert of plasmid pBBR-K-mev-op-up-4 contains the putative mevalonate operon
promoter region, the first four genes of the mevalonate operon and the 5' end of pmk.

Plasmids pBBR-K-mev-op16-2 and pBBR-K-mev-op-up-4 were each introduced into
Paracoccus sp. strain R114 by electroporation. Production of zeaxanthin and mevalonate
by the new strains was compared to that of the control strain R114. The strains were grown in
baffled shake flasks in liquid medium 362F/2 (see Example 11) for 72 hours. For strains
R114/pBBR-K-mev-op16-2 and R114/pBBR-K-mev-op-up-4, kanamycin (50 mg/l) was
also added to the cultures. The cultivation temperature was 28°C and shaking was at 200
rpm. Zeaxanthin was measured by the method set forth in Example 1, while mevalonate in
the culture supernatants was measured as follows: A 0.6 ml sample of the culture was
centrifuged for 4 minutes at 13,000 x g. Four hundred microliters of the supernatant were
added to 400 microliters of methanol and mixed by vortexing for 1 min. The mixture was
centrifuged again for 4 minutes at 13,000 x g. The resulting supernatant was then analyzed
directly by gas chromatography (GC) using the method of Lindemann et al. [J. Pharm.
Biomed. Anal. 9:311-316 (1991)] with minor modification as follows. The GC was a
Hewlett-Packard 6890plus instrument (Hewlett-Packard, Avondale, PA, USA) equipped
with a cool-on-column injector and a flame ionization detector. One microliter of sample
prepared as described above was injected onto a fused silica capillary column (15 m length
x 0.32 mm ID) coated with a 0.52 micron film of crosslinked modified polyethylene glycol
(HP-FFAP, Agilent Technologies, USA). Helium was used as the carrier gas at an inlet
pressure of 0.6 bar. The temperature of the programmable injector was ramped from 82°C
to 250°C at a rate of 30°C/minute. The column temperature profile was 80°C for 0.5
minutes, followed by a linear temperature gradient at 15°C/min to 250°C and finally held
at 250°C for 5 minutes. The detector temperature was maintained at 320°C.
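The timing implied by the temperature programs above, and the dilution introduced by the methanol step, can be worked out with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes ideal linear ramps and that the methanol step is a simple volumetric 1:1 dilution.

```python
def oven_program_minutes(initial_hold_min, start_c, end_c, ramp_c_per_min, final_hold_min):
    """Total duration of a hold / linear ramp / hold temperature program."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def dilution_factor(sample_ul, diluent_ul):
    """Factor by which a measured concentration must be multiplied
    to recover the concentration in the undiluted sample."""
    return (sample_ul + diluent_ul) / sample_ul


# Column: 80 C for 0.5 min, 15 C/min to 250 C, hold 5 min (values from the text).
column_run_min = oven_program_minutes(0.5, 80, 250, 15, 5)

# Injector: ramped from 82 C to 250 C at 30 C/min (no holds stated in the text).
injector_ramp_min = (250 - 82) / 30

# 400 microliters of supernatant mixed with 400 microliters of methanol.
factor = dilution_factor(400, 400)

print(f"column program: {column_run_min:.2f} min")   # 16.83 min
print(f"injector ramp:  {injector_ramp_min:.2f} min")  # 5.60 min
print(f"dilution factor: {factor:.1f}")               # 2.0
```

Under these assumptions each injection occupies the column for roughly 17 minutes, and a mevalonate concentration read from the diluted sample would be doubled to express it per liter of culture supernatant.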

In the first experiment, zeaxanthin and mevalonate production were measured in strains
R114 and R114/pBBR-K-mev-op16-2 (Table 32). Both strains produced similar amounts
of zeaxanthin, but strain R114/pBBR-K-mev-op16-2 produced a four-fold higher level of
mevalonate. These results show that overexpression of the genes of the mevalonate pathway
in Paracoccus sp. strain R114 results in increased carbon flow through the mevalonate
pathway. The accumulation of mevalonate was expected because strain R114/pBBR-K-mev-op16-2
does not have an overexpressed crtE gene, and the crtE gene product (GGPP
synthase) is known to be a limiting step in zeaxanthin production in Paracoccus sp. strain
R114 (see Examples 6 and 8). Cells having a limiting amount of GGPP synthase, upon
overproduction of the enzymes of the mevalonate pathway, would be expected to accumulate
FPP, and it is well known that FPP is a potent inhibitor of mevalonate kinase [Dorsey
and Porter, J. Biol. Chem. 243:4667-4670 (1968); Gray and Kekwick, BBA 279:290-296
(1972); Hinson et al., J. Lipid Res. 38:2216-2223 (1997)]. Therefore, accumulation of
FPP resulting from overexpression of the genes of the mevalonate pathway would cause
inhibition of mevalonate kinase, which in turn is manifested as mevalonate accumulation
in the culture.



Table 32. Zeaxanthin and mevalonate production in strains R114 and R114/pBBR-K-mev-op16-2.

Strain/plasmid              Mevalonate (mg/l)    Zeaxanthin (mg/l)
R114                        50.5                 70.0
R114/pBBR-K-mev-op16-2      208.2                65.2



In a second experiment, zeaxanthin and mevalonate production were measured in strain
R114 and two independent isolates of R114/pBBR-K-mev-op-up-4 (Table 33). These
results again show that overexpression of the genes of the mevalonate pathway increased
carbon flow through the mevalonate pathway.



Table 33. Zeaxanthin and mevalonate production in strains R114 and R114/pBBR-K-mev-op-up-4.

Strain/plasmid                          Mevalonate (mg/l)    Zeaxanthin (mg/l)
R114                                    45                   67.5
R114/pBBR-K-mev-op-up-4 (Isolate 1)     133.2                53.7
R114/pBBR-K-mev-op-up-4 (Isolate 2)     163.7                47.6
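The fold changes discussed for the two experiments can be recomputed directly from the values in Tables 32 and 33. The short Python sketch below is illustrative only; the helper names are hypothetical, and each strain is compared with the R114 control from the same experiment.

```python
def fold_change(value, control):
    """Ratio of a strain's titer to the control titer from the same experiment."""
    return value / control


# (mevalonate mg/l, zeaxanthin mg/l), copied from Tables 32 and 33.
experiments = {
    "Table 32": {
        "control": (50.5, 70.0),
        "strains": {"R114/pBBR-K-mev-op16-2": (208.2, 65.2)},
    },
    "Table 33": {
        "control": (45.0, 67.5),
        "strains": {
            "R114/pBBR-K-mev-op-up-4 (Isolate 1)": (133.2, 53.7),
            "R114/pBBR-K-mev-op-up-4 (Isolate 2)": (163.7, 47.6),
        },
    },
}


def summarize(data):
    """Return (table, strain, mevalonate fold, zeaxanthin fold) for every strain."""
    rows = []
    for table, entry in data.items():
        ctrl_mev, ctrl_zea = entry["control"]
        for strain, (mev, zea) in entry["strains"].items():
            rows.append((table, strain,
                         fold_change(mev, ctrl_mev),
                         fold_change(zea, ctrl_zea)))
    return rows


for table, strain, mev_fold, zea_fold in summarize(experiments):
    print(f"{table}  {strain}: mevalonate x{mev_fold:.2f}, zeaxanthin x{zea_fold:.2f}")
# Table 32  R114/pBBR-K-mev-op16-2: mevalonate x4.12, zeaxanthin x0.93
# Table 33  R114/pBBR-K-mev-op-up-4 (Isolate 1): mevalonate x2.96, zeaxanthin x0.80
# Table 33  R114/pBBR-K-mev-op-up-4 (Isolate 2): mevalonate x3.64, zeaxanthin x0.71
```

The roughly four-fold mevalonate increase for the op16-2 construct matches the figure given in the text. The ratios for the op-up-4 isolates are simply the same calculation applied to Table 33.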



WO 02/099095 PCT/EP02/06171

The following biological material was deposited under the terms of the Budapest Treaty
with the American Type Culture Collection (ATCC) at 10801 University Blvd., Manassas,
VA 20110-2201, USA, and was assigned the following accession numbers:

Strain                   Accession No.    Date of Deposit
Paracoccus sp. R114      PTA-3335         April 24, 2001
Paracoccus sp. R1534     PTA-3336         April 24, 2001
Paracoccus sp. R-1506    PTA-3431         June 5, 2001



All patents, patent applications, and publications cited above are incorporated herein by
reference in their entirety as if recited in full herein.

The invention being thus described, it will be obvious that the same may be varied in many
ways. Such variations are not to be regarded as a departure from the spirit and scope of
the invention and all such modifications are intended to be included within the scope of
the following claims.

What is claimed is:



1. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 340 of SEQ ID NO:43;

(b) an amino acid sequence shown as residues 1 to 349 of SEQ ID NO:45;

(c) an amino acid sequence shown as residues 1 to 388 of SEQ ID NO:47;

(d) an amino acid sequence shown as residues 1 to 378 of SEQ ID NO:49;

(e) an amino acid sequence shown as residues 1 to 305 of SEQ ID NO:51;

(f) an amino acid sequence shown as residues 1 to 332 of SEQ ID NO:53;

(g) a fragment of an amino acid sequence selected from the group consisting of SEQ ID
NOs: 43, 45, 47, 49, 51, and 53, wherein said fragment has at least 30 contiguous amino
acid residues;

(h) an amino acid sequence of a fragment of a polypeptide selected from the group
consisting of SEQ ID NOs: 43, 45, 47, 49, 51, and 53, the fragment having the activity of
hydroxymethylglutaryl-CoA reductase (HMG-CoA reductase), isopentenyl diphosphate
isomerase, hydroxymethylglutaryl-CoA synthase (HMG-CoA synthase), mevalonate
kinase, phosphomevalonate kinase, or diphosphomevalonate decarboxylase;

(i) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides of SEQ ID NO:42 or a complement of SEQ ID NO:42, wherein the polypeptide
has the activity of HMG-CoA reductase, isopentenyl diphosphate isomerase, HMG-CoA
synthase, mevalonate kinase, phosphomevalonate kinase, or diphosphomevalonate
decarboxylase; and

(j) a conservatively modified variant of SEQ ID NO:43, 45, 47, 49, 51 or 53.

2. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 287 of SEQ ID NO:159;

(b) at least 30 contiguous amino acid residues of SEQ ID NO:159;

(c) an amino acid sequence of a fragment of SEQ ID NO:159, the fragment having the
activity of farnesyl diphosphate synthase (FPP synthase);

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 295-1158 of SEQ ID NO:157 or a complement thereof,
wherein the polypeptide has the activity of FPP synthase; and

(e) a conservatively modified variant of SEQ ID NO:159.

3. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 142 of SEQ ID NO:160;

(b) at least 30 contiguous amino acid residues of SEQ ID NO:160;

(c) an amino acid sequence of a fragment of SEQ ID NO:160, the fragment having the
activity of 1-deoxyxylulose-5-phosphate synthase (DXPS);

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 1185-1610 of SEQ ID NO:157 or a complement thereof,
wherein the polypeptide has the activity of DXPS;

(e) a conservatively modified variant of SEQ ID NO:160.

4. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 390 of SEQ ID NO:178;

(b) at least 30 contiguous amino acid residues of SEQ ID NO:178;

(c) an amino acid sequence of a fragment of a polypeptide of SEQ ID NO:178, the
fragment having the activity of acetyl-CoA acetyltransferase;

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 1-1170 of SEQ ID NO:177 or a complement thereof,
wherein the polypeptide has the activity of acetyl-CoA acetyltransferase; and

(e) a conservatively modified variant of SEQ ID NO:178.

5. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown as residues 1 to 240 of SEQ ID NO:179;

(b) at least 30 contiguous amino acid residues of SEQ ID NO:179;

(c) an amino acid sequence of a fragment of a polypeptide of SEQ ID NO:179, the
fragment having the activity of acetoacetyl-CoA reductase;

(d) an amino acid sequence of a polypeptide encoded by a polynucleotide that hybridizes
under stringent conditions to a hybridization probe comprising at least 30 consecutive
nucleotides spanning positions 1258-1980 of SEQ ID NO:177 or a complement thereof,
wherein the polypeptide has the activity of acetoacetyl-CoA reductase; and

(e) a conservatively modified variant of SEQ ID NO:179.

6. An isolated polynucleotide sequence comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:42, variants of SEQ ID NO:42 containing one or more
substitutions according to the Paracoccus sp. strain R1534 codon usage table, fragments of
SEQ ID NO:42 that encode a polypeptide having an activity selected from the group
consisting of hydroxymethylglutaryl-CoA reductase (HMG-CoA reductase), isopentenyl
diphosphate isomerase, hydroxymethylglutaryl-CoA synthase (HMG-CoA synthase),
mevalonate kinase, phosphomevalonate kinase, and diphosphomevalonate decarboxylase,
and polynucleotide sequences that hybridize under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of at least 30 contiguous nucleotides of
SEQ ID NO:42, or the complement of SEQ ID NO:42, which polynucleotide encodes a
polypeptide having an activity selected from the group consisting of HMG-CoA reductase,
isopentenyl diphosphate isomerase, HMG-CoA synthase, mevalonate kinase,
phosphomevalonate kinase, and diphosphomevalonate decarboxylase.

7. An isolated polynucleotide sequence comprising a polynucleotide sequence selected
from the group consisting of the nucleotide sequence of SEQ ID NO:157, variants of SEQ
ID NO:157 containing one or more substitutions according to the Paracoccus sp. strain
R1534 codon usage table, fragments of SEQ ID NO:157 that encode a polypeptide having
farnesyl diphosphate (FPP) synthase activity, 1-deoxy-D-xylulose 5-phosphate synthase
activity or a polypeptide having the activity of XseB, and polynucleotide sequences that
hybridize under stringent conditions to a hybridization probe the nucleotide sequence of
which consists of at least 30 contiguous nucleotides of SEQ ID NO:157, or the complement
of SEQ ID NO:157, which polynucleotide encodes a polypeptide having an activity selected
from the group consisting of FPP synthase activity, 1-deoxy-D-xylulose 5-phosphate
synthase activity, and the activity of XseB.

8. An isolated polynucleotide sequence comprising a polynucleotide sequence selected
from the group consisting of the nucleotide sequence of SEQ ID NO:177, variants of SEQ
ID NO:177 containing one or more substitutions according to the Paracoccus sp. strain
R1534 codon usage table, fragments of SEQ ID NO:177 that encode a polypeptide having
an activity selected from the group consisting of acetyl-CoA acetyltransferase and
acetoacetyl-CoA reductase, and polynucleotide sequences that hybridize under stringent
conditions to a hybridization probe the nucleotide sequence of which consists of at least 30
contiguous nucleotides of SEQ ID NO:177, or the complement of SEQ ID NO:177, which
polynucleotide encodes a polypeptide having an activity selected from the group consisting
of acetyl-CoA acetyltransferase and acetoacetyl-CoA reductase.

9. An isolated polynucleotide sequence comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:42, SEQ ID NO:157, SEQ ID NO:177, and combinations
thereof.

10. An expression vector comprising the polynucleotide sequence according to claim 6, 7,
8 or 9.

11. An expression vector selected from the group consisting of pBBR-K-mev-op16-1,
pBBR-K-mev-op16-2, pDS-mvaA, pDS-idi, pDS-hcs, pDS-mvk, pDS-pmk, pDS-mvd,
pDS-His-mvaA, pDS-His-idi, pDS-His-hcs, pDS-His-mvk, pDS-His-pmk, pDS-His-mvd,
pBBR-K-Zea4, pBBR-K-Zea4-up, pBBR-K-Zea4-down, pBBR-K-PcrtE-crtE-3,
pBBR-tK-PcrtE-mvaA, pBBR-tK-PcrtE-idi, pBBR-tK-PcrtE-hcs, pBBR-tK-PcrtE-mvk,
pBBR-tK-PcrtE-pmk, pBBR-tK-PcrtE-mvd, pBBR-K-PcrtE-mvaA-crtE-3, pDS-His-phaA,
pBBR-K-PcrtE-crtW, pBBR-K-PcrtE-crtWZ, pBBR-K-PcrtE-crtZW, and combinations thereof.

12. A cultured cell comprising the polynucleotide sequence according to claim 6, 7, 8 or 9,
or an expression vector according to claim 10 or 11, or a progeny of the cell, wherein the
cell expresses a polypeptide encoded by the polynucleotide sequence.

13. A method of producing a carotenoid comprising culturing a cell according to claim 12
under conditions permitting expression of a polypeptide encoded by the polynucleotide
sequence, and isolating the carotenoid from the cell or the medium of the cell.

14. A method of making a carotenoid-producing cell comprising:

(a) introducing into a cell a polynucleotide sequence encoding an enzyme in the
mevalonate pathway, which enzyme is expressed in the cell; and

(b) selecting a cell containing the polynucleotide sequence of step (a) that produces a
carotenoid at a level that is about 1.1-1,000 times the level of the carotenoid produced by
the cell before introduction of the polynucleotide sequence.

15. A method for engineering a bacterium to produce an isoprenoid compound
comprising:

(a) culturing a parent bacterium in a medium under conditions permitting expression of
an isoprenoid compound, and selecting a mutant bacterium from the culture medium that
produces about 1.1-1,000 times more of an isoprenoid compound than the parent
bacterium;

(b) introducing into the mutant bacterium an expression vector comprising a
polynucleotide sequence represented by SEQ ID NO:42 operably linked to an expression
control sequence; and

(c) selecting a bacterium that contains the expression vector and produces at least about
1.1 times more of an isoprenoid compound than the mutant in step (a).

16. A microorganism of the genus Paracoccus, which microorganism has the following
characteristics:

(i) a sequence similarity to SEQ ID NO:12 of >97% using a similarity matrix obtained
from a homology calculation using GeneCompar v. 2.0 software with a gap penalty of 0%;
a homology to strain R-1512, R1534, R114 or R-1506 of >70% using DNA:DNA
hybridization at 81.5°C;
a G+C content of its genomic DNA that varies less than 1% from the G+C content of the
genomic DNA of R114, R-1512, R1534, and R-1506; and
an average DNA fingerprint that clusters at about 58% similarity to strains R-1512, R1534,
R114 and R-1506 using the AFLP procedure of Example 2, with the proviso that the
microorganism is not Paracoccus sp. (MBIC3966);

(ii) 18:1ω7c comprising at least about 75% of the total fatty acids of the cell membranes;
an inability to use adonitol, i-erythritol, gentiobiose, β-methylglucoside, D-sorbitol, xylitol
and quinic acid as carbon sources for growth; and
an ability to use L-asparagine and L-aspartic acid as carbon sources for growth, with the
proviso that the microorganism is not Paracoccus sp. (MBIC3966); or

(iii) an ability to grow at 40°C;
an ability to grow in a medium having 8% NaCl;
an ability to grow in a medium having a pH of 9.1; and
a yellow-orange colony pigmentation, with the proviso that the microorganism is not
Paracoccus sp. (MBIC3966).
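The G+C criterion of claim 16(i) can be illustrated with a small Python sketch. Several assumptions are made here and are not taken from the specification: the function names are hypothetical, "less than 1%" is read as an absolute difference of less than one percentage point, and the reference percentages in the usage lines are placeholders, not measured values. Genomic G+C content is determined experimentally, so counting bases in a sequence string, as done below for the short primer of SEQ ID NO:1, only demonstrates the arithmetic.

```python
def gc_content_percent(seq):
    """Percentage of G and C bases in a nucleotide string (whitespace ignored)."""
    bases = "".join(seq.split()).lower()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for base in bases if base in "gc")
    return 100.0 * gc / len(bases)


def within_gc_tolerance(candidate_pct, reference_pcts, tolerance=1.0):
    """True if the candidate differs from every reference by less than
    `tolerance` percentage points (an assumed reading of the claim)."""
    return all(abs(candidate_pct - ref) < tolerance for ref in reference_pcts)


# Arithmetic demonstration on the 20-mer of SEQ ID NO:1.
print(gc_content_percent("agagtttgat cctggctcag"))  # 50.0

# Placeholder (hypothetical) genomic G+C values, for illustration only.
hypothetical_references = [66.9, 67.3, 67.6, 67.1]
print(within_gc_tolerance(67.0, hypothetical_references))  # True
print(within_gc_tolerance(65.0, hypothetical_references))  # False
```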



SEQUENCE LISTING

<110> BERRY, Alan
      BRETZEL, Werner
      HUMBELIN, Markus
      LOPEZ-ULIBARRI, Rual
      MAYER, Anne
      YELISEEV, Alexei

<120> IMPROVED ISOPRENOID PRODUCTION

<130> C38435/121966

<160> 197

<170> PatentIn version 3.0

<210> 1
<211> 20
<212> DNA
<213> synthetic construct

<400> 1
agagtttgat cctggctcag 20

<210> 2
<211> 20
<212> DNA
<213> synthetic construct

<220>
<221> misc_feature
<223> n is c or t

<400> 2
ctggctcagg angaacgctg 20

<210> 3
<211> 20
<212> DNA
<213> synthetic construct

<400> 3
aaggaggtga tccagccgca 20

<210> 4
<211> 20
<212> DNA
<213> synthetic construct

<400> 4
ctcctacggg aggcagcagt 20

<210> 5
<211> 18
<212> DNA
<213> synthetic construct

<400> 5
cagcagccgc ggtaatac 18

<210> 6
<211> 19
<212> DNA
<213> synthetic construct

<400> 6
aactcaaagg aattgacgg 19

<210> 7
<211> 20
<212> DNA
<213> synthetic construct

<400> 7
agtcccgcaa cgagcgcaac 20

<210> 8
<211> 20
<212> DNA
<213> synthetic construct

<400> 8
gctacacacg tgctacaatg 20

<210> 9
<211> 20
<212> DNA
<213> synthetic construct

<400> 9
actgctgcct cccgtaggag 20

<210> 10
<211> 18
<212> DNA
<213> synthetic construct

<400> 10
gtattaccgc ggctgctg 18

<210> 11
<211> 20
<212> DNA
<213> synthetic construct

<400> 11
gttgcgctcg ttgcgggact 20

<210> 12
<211> 1404
<212> DNA
<213> Paracoccus sp. R-1512

<400> 12
gcggcaggct taacacatgc aagtcgagcg aggtcttcgg acctagcggc ggacgggtga 60
gtaacgcgtg ggaacgtgcc ctttgctacg gaatagtccc gggaaactgg gtttaatacc 120
gtatgtgccc tacgggggaa agatttatcg gcaaaggatc ggcccgcgtt ggattaggta 180
gttggtgggg taatggccta ccaagccgac gatccatagc tggtttgaga ggatgatcag 240
ccacactggg actgagacac ggcccagact cctacgggag gcagcagtgg ggaatcttag 300
acaatggggg caaccctgat ctagccatgc cgcgtgagtg atgaaggccc tagggttgta 360
aagctctttc agctgggaag ataatgacgg taccagcaga agaagccccg gctaactccg 420
tgccagcagc cgcggtaata cggagggggc tagcgttgtt cggaattact gggcgtaaag 480
cgcacgtagg cggactggaa agttgggggt gaaatcccgg ggctcaacct cggaactgcc 540
tccaaaacta tcagtctgga gttcgagaga ggtgagtgga ataccgagtg tagaggtgaa 600
attcgtagat attcggtgga acaccagtgg cgaaggcggc tcactggctc gatactgacg 660
ctgaggtgcg aaagcgtggg gagcaaacag gattagatac cctggtagtc cacgccgtaa 720
acgatgaatg ccagtcgtcg ggttgcatgc aattcggtga cacacctaac ggattaagca 780
ttccgcctgg ggagtacggt cgcaagatta aaactcaaag gaattgacgg gggcccgcac 840
aagcggtgga gcatgtggtt taattcgaag caacgcgcag aaccttacca acccttgaca 900
tccctggaca tcccgagaga tcgggctttc acttcggtga ccaggagaca ggtgctgcat 960
ggctgtcgtc agctcgtgtc gtgagatgtt cggttaagtc cggcaacgag cgcaacccac 1020
gtccctagtt gccagcattc agttgggcac tctatggaaa ctgccgatga taagtcggag 1080
gaaggtgtgg atgacgtcaa gtcctcatgg cccttacggg ttgggctaca cacgtgctac 1140
aatggtggtg acagtgggtt aatccccaaa agccatctca gttcggattg tcctctgcaa 1200
ctcgagggca tgaagttgga atcgctagta atcgcggaac agcatgccgc ggtgaatacg 1260
ttcccgggcc ttgtacacac cgcccgtcac accatgggag ttggttctac ccgacgacgc 1320
tgcgctaacc cttcggggag gcaggcggcc acggtaggat cagcgactgg ggtgaagtcg 1380
taacaaggta gccgtagggg aacc 1404

<210> 13
<211> 20
<212> DNA
<213> synthetic construct

<400> 13
tcgtagactg cgtacaggcc 20

<210> 14
<211> 14
<212> DNA
<213> synthetic construct

<400> 14
catctgacgc atgt 14

<210> 15
<211> 16
<212> DNA
<213> synthetic construct

<400> 15
gacgatgagt cctgac 16

<210> 16
<211> 14
<212> DNA
<213> synthetic construct

<400> 16
tactcaggac tggc 14

<210> 17
<211> 17
<212> DNA
<213> synthetic construct

<400> 17
gactgcgtac aggccca 17

<210> 18
<211> 18
<212> DNA
<213> synthetic construct

<400> 18
cgatgagtcc tgaccgaa 18

<210> 19
<211> 18
<212> DNA
<213> synthetic construct

<400> 19
cgatgagtcc tgaccgac 18

<210> 20
<211> 17
<212> DNA
<213> synthetic construct

<400> 20
gactgcgtac aggcccc 17

<210> 21
<211> 17
<212> DNA
<213> synthetic construct

<400> 21
gactgcgtac aggcccg 17

<210> 22
<211> 18
<212> DNA
<213> synthetic construct

<400> 22
cgatgagtcc tgaccgag 18

<210> 23
<211> 8
<212> PRT
<213> Paracoccus sp. R1534

<220>
<221> UNSURE
<222> (2)..(2)
<223> Xaa is Leu or Ile

<400> 23
Ala Xaa Ile Lys Tyr Trp Gly Lys
1               5

<210> 24
<211> 27
<212> DNA
<213> Paracoccus sp. R1534

<400> 24
ccsctgatca artaytgggg baaratc 27

<210> 25
<211> 20
<212> DNA
<213> Paracoccus sp. R1534

<400> 25
gcsctgatca artaytgggg 20

<210> 26
<211> 20
<212> DNA
<213> Paracoccus sp. R1534

<400> 26
gcsatcatca artaytgggg 20

<210> 27
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 27
atcaartayt ggggtaa 17

<210> 28
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 28
atcaartayt ggggcaa 17

<210> 29
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 29
atcaartayt gggggaa 17

<210> 30
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 30
atcaartayt ggggaaa 17

<210> 31
<211> 8
<212> PRT
<213> Paracoccus sp. R1534

<220>
<221> UNSURE
<222> (7)..(7)
<223> Xaa is Asn or Gln

<400> 31
Thr Met Asp Ala Gly Pro Xaa Val
1               5

<210> 32
<211> 24
<212> DNA
<213> Paracoccus sp. R1534

<220>
<221> misc_feature
<222> (19)..(19)
<223> n is a or c

<220>
<221> misc_feature
<222> (21)..(21)
<223> n is y or r

<400> 32
acsatggayg csggbccsna ngts 24

<210> 33
<211> 24
<212> DNA
<213> Paracoccus sp. R1534

<220>
<221> misc_feature
<222> (19)..(19)
<223> n is t or g

<220>
<221> misc_feature
<222> (21)..(21)
<223> n is r or y

<400> 33
tgstacctrc gsccvggsnt ncas 24

<210> 34
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 34
tggtacctac gsccvgg 17

<210> 35
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 35
tggtacctgc gsccvgg 17

<210> 36
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 36
tgctacctac gsccvgg 17

<210> 37
<211> 17
<212> DNA
<213> Paracoccus sp. R1534

<400> 37
tgctacctgc gsccvgg 17

<210> 38
<211> 20
<212> DNA
<213> Paracoccus sp. R1534

<400> 38
tacctacgsc cvggsttrca 20

<210> 39
<211> 20
<212> DNA
<213> Paracoccus sp. R1534

<400> 39
tacctgcgsc cvggsttrca 20

<210> 40
<211> 20
<212> DNA
<213> Paracoccus sp. R1534

<400> 40
tacctacgsc cvggsgtyca 20

<210> 41
<211> 20
<212> DNA
<213> Paracoccus sp. R1534

<400> 41
tacctgcgsc cvggsgtyca 20

<210> 42
<211> 9066
<212> DNA
<213> Paracoccus sp. R114

<220>
<221> CDS
<222> (2622)..(3644)
<223> mvaA gene
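As a consistency check on the CDS annotation for the mvaA gene, the coordinates (2622)..(3644) can be converted into a nucleotide and codon count. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes that the annotated range is inclusive at both ends and includes the stop codon.

```python
def cds_summary(start, end):
    """Return (nucleotides, codons, encoded residues) for an inclusive CDS range.

    Assumes the range ends with a stop codon, so the number of encoded
    residues is one less than the number of codons.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = length // 3
    return length, codons, codons - 1


print(cds_summary(2622, 3644))  # (1023, 341, 340)
```

Under these assumptions the reading frame encodes 340 residues, which agrees with the "residues 1 to 340 of SEQ ID NO:43" recited in claim 1(a), provided SEQ ID NO:43 is indeed the translation of this CDS.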
<400> 42
ggatccggca gctcgacacg ccgcagaacc tgtacgaacg tcccgccagc cgcttcgtcg 60
cggaattcgt cgggcgcggg acggtggtgc ccgtgcaggc ccatgacggc gcgggccgcg 120
cccgcatcct gggggccgag gtggcggtga acgccgcccc gcaatcgcgc tttgtcgatc 180
acgtctgcct gcgccccgag aaccttgcca tctccgagac gggcgacctg cgcgccaagg 240
tcaccrcgcgt cacctatctt ggcgggaaat acctgctgga aaccgtgctg gattgcggca 300
cccacrc tQQt gaccgagacc cgcgcccgct tcgatacggg cgcgcagctt ggcctgacca 360
tcaacocccc ctgggccttt gccgaggatt gaatggacag cgtgaagatc ctttcgggca 420
t" acfGCG fccraa gggccctgcc tgcatcaggc tggatgtcgg cgggatgcgc ctgatcctcg 480
a. 1 1 a c a a cs a c cggcccggac gagggcgcgg agttcgaccc cgcctggctg gcggacgcgg 540
atacQcrtgct gatcacccat gaccacgtgg accatatcgg cggcgcgcgt cacgcggtcg 600
cocrcggggct gccgatccat gcgacgcggc agacggcggg gttgctgccc gcgggggcgg 660
atctgcgcct gctgcccgaa cgcggtgtca cgcggatcgc cggggtcgat ctgacgaccg 720
atcqcaacgg gcatgccgcg ggcggcgtct ggatgcattt cgacatgggc gaggggctgt 780
tctattccgg cgactggtcc gaggaatccg actggttcgc cttcgatccg cccccgcctg 840
cggggacggc gattctcgac tgctcctatg gcggtttcga cgtggcgcaa tcggattgca 900
tcgcggacct ggacgacctg ctcgaggtgc tgccggggca ggtactgctg ccggtgccgc 960
catccggccg cgcggccgag ctggccctgc ggctgatccg ccgccacgga ccgggcagcg 1020
tgatggtcga cgacgcctgc ctgccggcca tcgcgcaact gcccgaggcg cgcggactgg 1080
cctacgccac cgaggcacgc tttcttgtct gcgacacgcc gaacgccgaa agccggcgcg 1140
gcatggcggc atctgcaagc atggcgcgat gcgggcaggc tggggcggga cgcgcatgtc 1200
gtcttcaccg [illegible] catccataCQ cqcgcattct gcgaccgccc cggcgggcat 1260
ttccgccgct ggaacgtgca tccgccgctg cgcgaccagc gacggatgct ggaacggctg 1320
gccgcgcggc gctttgcccc ggccttctgc cccgaccccg agatctatct ggcgctggac 1380
atgggcgcgc aggtcttcat gcaccaggag gtgacgccat gatccccgcc cgcagcttct 1440
gcctgatccg ccacggcgaa acgaccgcca atgcaggggc gatcatcgcg ggcgcaaccg 1500
atgtgcccct gacgccaagg ggccgcgatc aggcccgcgc cctggcaggg cgcgaatggc 1560
catcgggcat cgcgctgttc gccagcccga tgtcgcgtgc ccgcgatacc gcgctgctgg 1620
cctttccggg gcgcgaccac cagcccgaac ccgatctgcg cgaacgcgac tggggcatct 1680 

ccgagggacg ccccgtcgcc gatctgcccc cgcgcgaaat cacgccgcag gggggcgagg 1740 

gctgggacga cgtgatggcc cgcgtggacc gcgcgatccg gcggatctgc gcgacctcgg 1800 

gcgatgcgct gccggtgctg gtctgccatt cgggcgtgat ccgtgccgcg cgcgtgctgt 1860 

ggaccaccgg cgatgcgggc gatcgtccgc ccaacgccac gccgatcctg ttcagcccgg 1920 

acggcgaccg attaaaggaa ggaacgatat gaccgccacc accccctgcg tcgtcttcga 1980 

acgtggacgg cacgcttgcc gaattcgacg ccgaccgcct gggccatctt gtccacggca 2 040 

cgaccaagca ctgggacgcc ttccaccacg cgatggccga cgccccgccc atccccgagg 2100 

tcgcccgcct gatgcgcaag ctgaaggagg ggggcgagac ggtcgtcatc tgctcggggc 2160 

ggccccgcgg ctggcaggat cagacgatcg catggctgcg caagcacgac ctgcccttcg 2220 

acgggatcta tctgcgcccc gaggatcagg acggcgccag cgaccccgag gtcaagcgcc 2280 

gcgccctagc cgagatgcgc gccgacgggc tggcgccctg gctggtcgtg gacgaccggc 2340 

ggtccgtcgt ggatgcctgg cgggccgagg ggctggtctg cctgcaatgc gcgccggggg 2400 

acttctaggg ccgcgcgacg ggggcgcgga caggctgggc gggaaaccgc cccgccacca 2460 

tgtcctgcac gcgtcgaacc gcccgtccga cgccggtttc cgcacggaaa cgcgcggcaa 2520 

gttgacataa cttgcacgcg acgtctcgat tctgcccgcg aagaatgcga tgcatccaga 2580 

tgatgcagaa cgaagaagcg gaagcgcccg tgaaagacca g atg att tcc cat acc 2636
Met Ile Ser His Thr
1 5

ccg gtg ccc acg caa tgg gtc ggc ccg atc ctg ttc cgc ggc ccc gtc 2684
Pro Val Pro Thr Gln Trp Val Gly Pro Ile Leu Phe Arg Gly Pro Val
10 15 20

gtc gag ggc ccg atc agc gcg ccg ctg gcc acc tac gag acg ccg ctc 2732
Val Glu Gly Pro Ile Ser Ala Pro Leu Ala Thr Tyr Glu Thr Pro Leu
25 30 35

tgg ccc tcg acc gcg cgg ggg gca ggg gtt tcc cgg cat tcg ggc ggg 2780
Trp Pro Ser Thr Ala Arg Gly Ala Gly Val Ser Arg His Ser Gly Gly
40 45 50

atc cag gtc tcg ctg gtc gac gaa cgc atg agc cgc tcg atc gcg ctg 2828
Ile Gln Val Ser Leu Val Asp Glu Arg Met Ser Arg Ser Ile Ala Leu
55 60 65

cgg gcg cat gac ggg gcg gcg gcg acc gcc gcc tgg cag tcg atc aag 2876
Arg Ala His Asp Gly Ala Ala Ala Thr Ala Ala Trp Gln Ser Ile Lys
70 75 80 85
gcc cgc cag gaa gag gtc gcg gcc gtg gtc gcc acc acc agc cgc ttc 2924
Ala Arg Gln Glu Glu Val Ala Ala Val Val Ala Thr Thr Ser Arg Phe
90 95 100

gcc cgc ctt gtc gag ctg aat cgc cag atc gtg ggc aac ctg ctt tac 2972
Ala Arg Leu Val Glu Leu Asn Arg Gln Ile Val Gly Asn Leu Leu Tyr
105 110 115

atc cgc atc gaa tgc gtg acg ggc gac gcc tcg ggt cac aac atg gtc 3020
Ile Arg Ile Glu Cys Val Thr Gly Asp Ala Ser Gly His Asn Met Val
120 125 130

acc aag gcc gcc gag gcc gtg cag ggc tgg atc ctg tcg gaa tac ccg 3068
Thr Lys Ala Ala Glu Ala Val Gln Gly Trp Ile Leu Ser Glu Tyr Pro
135 140 145

atg ctg gcc tat tcc acg atc tcg ggg aac ctg tgc acc gac aag aag 3116
Met Leu Ala Tyr Ser Thr Ile Ser Gly Asn Leu Cys Thr Asp Lys Lys
150 155 160 165

gcg tcg gcg gtc aac ggc atc ctg ggc cgc ggc aaa tac gcc gtc gcc 3164
Ala Ser Ala Val Asn Gly Ile Leu Gly Arg Gly Lys Tyr Ala Val Ala
170 175 180

gag gtc gag atc ccg cgc aag atc ctg acc cgc gtg ctg cgc acc agc 3212
Glu Val Glu Ile Pro Arg Lys Ile Leu Thr Arg Val Leu Arg Thr Ser
185 190 195

gcc gag aag atg gtc cgc ctg aac tac gag aag aac tat gtc ggg ggt 3260
Ala Glu Lys Met Val Arg Leu Asn Tyr Glu Lys Asn Tyr Val Gly Gly
200 205 210

acg ctg gcg ggg tcg ctg cgc agt gcg aac gcg cat ttc gcc aac atg 3308
Thr Leu Ala Gly Ser Leu Arg Ser Ala Asn Ala His Phe Ala Asn Met
215 220 225

ctg ctg ggc ttc tac ctg gcg acg ggg cag gac gcg gcc aac atc atc 3356
Leu Leu Gly Phe Tyr Leu Ala Thr Gly Gln Asp Ala Ala Asn Ile Ile
230 235 240 245

gag gcc agc cag ggc ttc gtc cat tgc gag gcc cgc ggc gag gat ctg 3404
Glu Ala Ser Gln Gly Phe Val His Cys Glu Ala Arg Gly Glu Asp Leu
250 255 260

tat ttc tcg tgc acg ctg ccc aac ctc atc atg ggc tcg gtc ggt gcc 3452
Tyr Phe Ser Cys Thr Leu Pro Asn Leu Ile Met Gly Ser Val Gly Ala
265 270 275

ggc aag ggc atc ccc tcg atc gag gag aac ctg tcg cgg atg ggc tgc 3500
Gly Lys Gly Ile Pro Ser Ile Glu Glu Asn Leu Ser Arg Met Gly Cys
280 285 290

cgc cag ccg ggc gaa ccc ggc gac aac gcg cgc cgt ctt gcg gcg atc 3548
Arg Gln Pro Gly Glu Pro Gly Asp Asn Ala Arg Arg Leu Ala Ala Ile
295 300 305

tgc gcg ggc gtc gtg ctg tgt ggt gaa ttg tcg ctg ctt gcg gcc cag 3596

Cys Ala Gly Val Val Leu Cys Gly Glu Leu Ser Leu Leu Ala Ala Gln
310 315 320 325

acc aac ccc gga gag ttg gtc cgc acc cac atg gag atg gag cga tga 3644
Thr Asn Pro Gly Glu Leu Val Arg Thr His Met Glu Met Glu Arg
330 335 340



ccgacagcaa ggatcaccat gtcgcggggc gcaagctgga ccatctgcgt gcattggacg 3704 

acgatgcgga tatcgaccgg ggcgacagcg gcttcgaccg catcgcgctg acccatcgcg 3764 

ccctgcccga ggtggatttc gacgccatcg acacggcgac cagcttcctg ggccgtgaac 3824

tgtccttccc gctgctgatc tcgtccatga ccggcggcac cggcgaggag atcgagcgca 3884 

tcaaccgcaa cctggccgct ggtgccgagg aggcccgcgt cgccatggcg gtgggctcgc 3944 



agcgcgtgat gttcaccgac ccctcggcgc gggccagctt cgacctgcgc gcccatgcgc 4004



ccaccgtgcc gctgctggcc aatatcggcg cggtgcagct gaacatgggg ctggggctga 4064 
aggaatgcct ggccgcgatc gaggtgctgc aggcggacgg cctgtatctg cacctgaacc 4124 
ccctgcaaga ggccgtccag cccgaggggg atcgcgactt tgccgatctg ggcagcaaga 4184 
tcgcggccat cgcccgcgac gttcccgtgc ccgtcctgct gaaggaggtg ggctgcggcc 4244 
tgtcggcggc cgatatcgcc atcgggctgc gcgccgggat ccggcatttc gacgtggccg 4304 

gtcgcggcgg cacatcctgg agccggatcg agtatcgccg ccgccagcgg gccgatgacg 4364 

acctgggcct ggtcttccag gactggggcc tgcagaccgt ggacgccctg cgcgaggcgc 4424 

ggcccgcgct tgcggcccat gatggaacca gcgtgctgat cgccagcggc ggcatccgca 4484 

acggtgtcga catggcgaaa tgcgtcatcc tgggggccga catgtgcggg gtcgccgcgc 4544 

ccctgctgaa agcggcccaa aactcgcgcg aggcggttgt atccgccatc cggaaactgc 4604 

atctggagtt ccggacagcc atgttcctcc tgggttgcgg cacgcttgcc gacctgaagg 4664 

acaattcctc gcttatccgt caatgaaagt gcctaagatg accgtgacag gaatcgaagc 4724 

gatcagcttc tacacccccc agaactacgt gggactggat atccttgccg cgcatcacgg 4784 

gatcgacccc gagaagttct cgaaggggat cgggcaggag aaaatcgcac tgcccggcca 4844 

tgacgaggat atcgtgacca tggccgccga ggccgcgctg ccgatcatcg aacgcgcggg 4904 

cacgcagggc atcgacacgg ttctgttcgc caccgagagc gggatcgacc agtcgaaggc 4964 

cgccgccatc tatctgcgcc gcctgctgga cctgtcgccc aactgccgtt gcgtcgagct 5024 

gaagcaggcc tgctattccg cgacggcggc gctgcagatg gcctgcgcgc atgtcgcccg 5084 

caagcccgac cgcaaggtgc tggtgatcgc gtccgatgtc gcgcgctatg accgcgaaag 5144 



ctcgggcgag 


gcgacgcagg 


gtgcgggcgc 


cgtcgccatc 


CCCy ccdgcg 


v^, v_j c* w V— u a 


5204 


ggtggccgag 


atcggcaccg 


tctcggggct 


gt tcaccgag 


gaca tea L.yy 


civ- ' — * — ^yy^-y 


5264 


gccgaaccac 


cgccgcacgc 


ccctgttcga 


cggcaaggca 


tcgacgccgc 


z-y /~» t - ,=i t~ f* t~ a 
y L Lu ll Lyaa 




cgcgctggtc 


gaggcgtgga 


acgactatcg 


cgcgaatggc 


ggccacgagc 


Y~ rv r* t*r t~ ^ t~ 


_J _3 O *1 


cgcgcatttc 


tgctatcacg 


tgccgttctc 


gcggatgggc 


gagaaggcga 


aCayCCaCCt 


Of* f* *i 


ggccaaggcg 


aacaagacgc 


cggtggacat 


ggggcaggtg 


cagacgggcc 


v-«* ^ ^ ^ ^ ^5 

LyatCt aCaa 


r ^ n a 


ccggcaggtc 


gggaactgct 


ataccgggtc 


gatctacctg 


gcattcgccc 


cgccgctyga 




gaacgctcag 


gaggacctga 


ccggcgcgct 


ggtcggtctg 


ttcagctacg 


gcccgggcgc 




gacgggcgaa 


ttcttcgatg 


cgcggatcgc 


gcccggttac 


cgcgaccacc 


cgtccgcgga 




acgccatcgc 


gaattgctgc 


aggatcgcac 


gcccgtcaca 


tatgacgaat 


acgttgcccc 


D / fi4 


gtgggacgag 


atcgacctga 


cgcagggcgc 


gcccgacaag 


gcgcgcggtc 


gttccaggcL 


D O U ft 


ggcaggtatc 


gaggacgaga 


agcgcatcta 


tgtcgaccgg 


caggcctgaa 


gcaggcgccc 


Dobfl 


atgccccggg 


caagctgatc 


ctgtccgggg 


aacattccgt 


gctctatggt 


gcgcccgcgc 




ttgccatggc 


catcgcccgc 


tataccgagg 


tgtggttcac 


gccgcttggc 


actggcgagg 


CQO/1 

Dyoft 


ggatacgcac 


gacattcgcc 


aatctctcgg 


gcggggcgac 


ctattcgctg 


aagccgccgc 


O Ufift 


cggggttcaa 


gtcgcggctg 


gaccgccggt 


_ A. - * 

tcgagcagtt 


cctgaacggc 


gacccaaagg 


OlUft 


tgcacaaggt 


cctgacccat 


cccgacgatc 


tggcggtcta 


tgcgctggcg 


ccgcttcuyc 


D1D4 


acgacaagcc 


gccggggacc 


gccgcgatgc 


cgggcatcgg 


cgcgatgcac 


cacccgccgc 


GOO A 


gaccgggtga 


gctgggcagc 


cggacggagc 


tgcccatcgg 


cgcgggcatg 


gggccgcc eg 




cggccatcgt cgcggccacc acggtcctgt tcgagacgct gctggaccgg cccaagacgc 6344
ccgaacagcg cttcgaccgc gtccgcttct gcgagcggtt gaagcacggc aaggccggtc 6404
ccatcgacgc ggccagcgtc gtgcgcggcg ggcttgtccg cgtgggcggg aacgggccgg 6464
gttcgatcag cagcttcgat ttgcccgagg atcacgacct tgtcgcggga cgcggctggt 6524
actgggtact gcacgggcgc cccgtcagcg ggaccggcga atgcgtcagc gcggtcgcgg 6584
cggcgcatgg tcgcgatgcg gcgctgtggg acgccttcgc agtctgcacc cgcgcgttgg 6644
aggccgcgct gctgtctggg ggcagccccg acgccgccat caccgagaac cagcgcctgc 6704
tggaacgcat cggcgtcgtg ccggcagcga cgcaggccct cgtggcccag atcgaggagg 6764
cgggtggcgc ggccaagatc tgcggcgcag gttccgtgcg gggcgatcac ggcggggcgg 6824
tcctcgtgcg gattgacgac gcgcaggcga tggcttcggt catggcgcgc catcccgacc 6884
tcgactgggc gcccctgcgc atgtcgcgca cgggggcggc acccggcccc gcgccgcgtg 6944

cgcaaccgct gccggggcag ggctgatgga tcaggtcatc cgcgccagcg cgccgggttc 7004 

ggtcatgatc acgggcgaac atgccgtggt ctatggacac cgcgccatcg tcgccgggat 7064 

cgagcagcgc gcccatgtga cgatcgtccc gcgtgccgac cgcatgtttc gcatcacctc 7124 

gcagatcggg gcgccgcagc aggggtcgct ggacgatctg cctgcgggcg ggacctatcg 7184 

cttcgtgctg gccgccatcg cgcgacacgc gccggacctg ccttgcgggt tcgacatgga 7244 

catcacctcg gggatcgatc cgaggctcgg gcttggatcc tcggcggcgg tgacggtcgc 7304 

ctgcctcggc gcgctgtcgc ggctggcggg gcgggggacc gaggggctgc atgacgacgc 7364 

gctgcgcatc gtccgcgcca tccagggcag gggcagcggg gccgatctgg cggccagcct 7424 

gcatggcggc ttcgtcgcct atcgcgcgcc cgatggcggt gccgcgcaga tcgaggcgct 7484 

tccggtgccg ccggggccgt tcggcctgcg ctatgcgggc tacaagaccc cgacagccga 7544 

ggtgctgcgc cttgtggccg atcggatggc gggcaacgag gccgctttcg acgcgctcta 7604 

ctcccggatg ggcgcaagcg cagatgccgc gatccgcgcg gcgcaagggc tggactgggc 7664 

tgcattccac gacgcgctga acgaatacca gcgcctgatg gagcagctgg gcgtgtccga 7724 

cgacacgctg gacgcgatca tccgcgaggc gcgcgacgcg ggcgccgcag tcgccaagat 7784 

ctccggctcg gggctggggg attgcgtgct ggcactgggc gaccagccca agggtttcgt 7844 

gcccgcaagc attgccgaga agggacttgt tttcgatgac tgatgccgtc cgcgacatga 7904 

tcgcccgtgc catggcgggc gcgaccgaca tccgagcagc cgaggcttat gcgcccagca 7964 

acatcgcgct gtcgaaatac tggggcaagc gcgacgccgc gcggaacctt ccgctgaaca 8024

gctccgtctc gatctcgttg gcgaactggg gctctcatac gcgggtcgag gggtccggca 8084 

cgggccacga cgaggtgcat cacaacggca cgctgctgga tccgggcgac gccttcgcgc 8144 

gccgcgcgtt ggcattcgct gacctgttcc ggggggggag gcacctgccg ctgcggatca 8204 

cgacgcagaa ctcgatcccg acggcggcgg ggcttgcctc gtcggcctcg gggttcgcgg 8264 

cgctgacccg tgcgctggcg ggggcgttcg ggctggatct ggacgacacg gatctgagcc 8324 

gcatcgcccg gatcggcagt ggcagcgccg cccgctcgat ctggcacggc ttcgtccgct 8384 

ggaaccgggg cgaggccgag gatgggcatg acagccacgg cgtcccgctg gacctgcgct 8444 

ggcccggctt ccgcatcgcg atcgtggccg tggacaaggg gcccaagcct ttcagttcgc 8504 

gcgacggcat gaaccacacg gtcgagacca gcccgctgtt cccgccctgg cctgcgcagg 8564 
cggaagcgga 


ttgccgcgtc 


atcgaggatg 


cgaccgccgc 




y y 1 — > — - w ^ 


8624 


gtccgcgggt 


cgaggcgaac 


gcccttgcga 


tgcacgccac 


gatgacggcc 


fr f ft f* ft /** f Off"** 


8684 


cgctctgcta 


cctgacgggc 


ggcagctggc 


aggtgctgga 


acgcctgtgg 


ft f* ft ff 

caggcccgcg 


O ' " " 


cggacgggct 


tgcggccttt 


gcgacgatgg 


atgccggccc 


gdaCgtCaay 


rtaatcttca 


8804 


aggaaagcag 


cgccgccgac 


gtgctgtacc 


tgttccccga 


cgccagcctg 


atcgcgccgt 


8864 


tcgaggggcg 


ttgaacgcgt 


aagacgacca 


ctgggtaagg 


ttctgccgcg 


cgtggtctcg 


8924 


actgcctgca 


aagaggtgct 


tgagttgctg 


cgtgactgcg 


gcggccgact 


tcgtgggact 


8984 


tgcccgccac 


gctgacgcgc 


tggaaacgcg 


cccgcggatt 


acgaccgcgt 


cattgccctg 


9044 


aaccaatttc 


ccgtcggtcg 


ac 








9066 



<210> 43 

<211> 340 

<212> PRT 

<213> Paracoccus sp. R114 



<400> 43 

Met Ile Ser His Thr Pro Val Pro Thr Gln Trp Val Gly Pro Ile Leu
1 5 10 15

Phe Arg Gly Pro Val Val Glu Gly Pro Ile Ser Ala Pro Leu Ala Thr
20 25 30

Tyr Glu Thr Pro Leu Trp Pro Ser Thr Ala Arg Gly Ala Gly Val Ser
35 40 45

Arg His Ser Gly Gly Ile Gln Val Ser Leu Val Asp Glu Arg Met Ser
50 55 60

Arg Ser Ile Ala Leu Arg Ala His Asp Gly Ala Ala Ala Thr Ala Ala
65 70 75 80

Trp Gln Ser Ile Lys Ala Arg Gln Glu Glu Val Ala Ala Val Val Ala
85 90 95

Thr Thr Ser Arg Phe Ala Arg Leu Val Glu Leu Asn Arg Gln Ile Val
100 105 110

Gly Asn Leu Leu Tyr Ile Arg Ile Glu Cys Val Thr Gly Asp Ala Ser
115 120 125

Gly His Asn Met Val Thr Lys Ala Ala Glu Ala Val Gln Gly Trp Ile
130 135 140

Leu Ser Glu Tyr Pro Met Leu Ala Tyr Ser Thr Ile Ser Gly Asn Leu
145 150 155 160

Cys Thr Asp Lys Lys Ala Ser Ala Val Asn Gly Ile Leu Gly Arg Gly
165 170 175

Lys Tyr Ala Val Ala Glu Val Glu Ile Pro Arg Lys Ile Leu Thr Arg
180 185 190

Val Leu Arg Thr Ser Ala Glu Lys Met Val Arg Leu Asn Tyr Glu Lys
195 200 205

Asn Tyr Val Gly Gly Thr Leu Ala Gly Ser Leu Arg Ser Ala Asn Ala
210 215 220

His Phe Ala Asn Met Leu Leu Gly Phe Tyr Leu Ala Thr Gly Gln Asp
225 230 235 240

Ala Ala Asn Ile Ile Glu Ala Ser Gln Gly Phe Val His Cys Glu Ala
245 250 255

Arg Gly Glu Asp Leu Tyr Phe Ser Cys Thr Leu Pro Asn Leu Ile Met
260 265 270

Gly Ser Val Gly Ala Gly Lys Gly Ile Pro Ser Ile Glu Glu Asn Leu
275 280 285

Ser Arg Met Gly Cys Arg Gln Pro Gly Glu Pro Gly Asp Asn Ala Arg
290 295 300

Arg Leu Ala Ala Ile Cys Ala Gly Val Val Leu Cys Gly Glu Leu Ser
305 310 315 320

Leu Leu Ala Ala Gln Thr Asn Pro Gly Glu Leu Val Arg Thr His Met
325 330 335

Glu Met Glu Arg
340

<210> 44 
<211> 9066 
<212> DNA 

<213> Paracoccus sp. R114
<220> 

<221> CDS 

<222> (3641)..(4690)
<223> idi 



<400> 44 



ggatccggca gctcgacacg ccgcagaacc tgtacgaacg tcccgccagc cgcttcgtcg 60
cggaattcgt cgggcgcggg acggtggtgc ccgtgcaggc ccatgacggc gcgggccgcg 120
cccgcatcct gggggccgag gtggcggtga acgccgcccc gcaatcgcgc tttgtcgatc 180
acgtctgcct gcgccccgag aaccttgcca tctccgagac gggcgacctg cgcgccaagg 240
tcgcgcgcgt cacctatctt ggcgggaaat acctgctgga aaccgtgctg gattgcggca 300
cccggctggt gaccgagacc cgcgcccgct tcgatacggg cgcgcagctt ggcctgacca 360
tcaacgcccc ctgggccttt gccgaggatt gaatggacag cgtgaagatc ctttcgggca 420
tgggcgtgaa gggccctgcc tgcatcaggc tggatgtcgg cgggatgcgc ctgatcctcg 480
attgcgggac cggcccggac gagggcgcgg agttcgaccc cgcctggctg gcggacgcgg 540
atgcggtgct gatcacccat gaccacgtgg accatatcgg cggcgcgcgt cacgcggtcg 600
cggcggggct gccgatccat gcgacgcggc agacggcggg gttgctgccc gcgggggcgg 660
atctgcgcct gctgcccgaa cgcggtgtca cgcggatcgc cggggtcgat ctgacgaccg 720
gtcgcaacgg gcatgccgcg ggcggcgtct ggatgcattt cgacatgggc gaggggctgt 780
tctattccgg cgactggtcc gaggaatccg actggttcgc cttcgatccg cccccgcctg 840
cggggacggc gattctcgac tgctcctatg gcggtttcga cgtggcgcaa tcggattgca 900
tcgcggacct ggacgacctg ctcgaggtgc tgccggggca ggtactgctg ccggtgccgc 960
catccggccg cgcggccgag ctggccctgc ggctgatccg ccgccacgga ccgggcagcg 1020
tgatggtcga cgacgcctgc ctgccggcca tcgcgcaact gcccgaggcg cgcggactgg 1080
cctacgccac cgaggcacgc tttcttgtct gcgacacgcc gaacgccgaa agccggcgcg 1140
gcatggcggc atctgcaagc atggcgcgat gcgggcaggc tggggcggga cgcgcatgtc 1200
gtcttcaccg ggcacatgaa cgtccatgcg cgcgcattct gcgaccgccc cggcgggcat 1260
ttccgccgct ggaacgtgca tccgccgctg cgcgaccagc gacggatgct ggaacggctg 1320
gccgcgcggc gctttgcccc ggccttctgc cccgaccccg agatctatct ggcgctggac 1380
atgggcgcgc aggtcttcat gcaccaggag gtgacgccat gatccccgcc cgcagcttct 1440
gcctgatccg ccacggcgaa acgaccgcca atgcaggggc gatcatcgcg ggcgcaaccg 1500
atgtgcccct gacgccaagg ggccgcgatc aggcccgcgc cctggcaggg cgcgaatggc 1560
catcgggcat cgcgctgttc gccagcccga tgtcgcgtgc ccgcgatacc gcgctgctgg 1620
cctttccggg gcgcgaccac cagcccgaac ccgatctgcg cgaacgcgac tggggcatct 1680
tcgagggacg ccccgtcgcc gatctgcccc cgcgcgaaat cacgccgcag gggggcgagg 1740
gctgggacga cgtgatggcc cgcgtggacc gcgcgatccg gcggatctgc gcgacctcgg 1800
gcgatgcgct gccggtgctg gtctgccatt cgggcgtgat ccgtgccgcg cgcgtgctgt 1860
ggaccaccgg cgatgcgggc gatcgtccgc ccaacgccac gccgatcctg ttcagcccgg 1920
acggcgaccg attaaaggaa ggaacgatat gaccgccacc accccctgcg tcgtcttcga 1980
acgtggacgg cacgcttgcc gaattcgacg ccgaccgcct gggccatctt gtccacggca 2040
cgaccaagca ctgggacgcc ttccaccacg cgatggccga cgccccgccc atccccgagg 2100
tcgcccgcct gatgcgcaag ctgaaggagg ggggcgagac ggtcgtcatc tgctcggggc 2160
ggccccgcgg ctggcaggat cagacgatcg catggctgcg caagcacgac ctgcccttcg 2220
acgggatcta tctgcgcccc gaggatcagg acggcgccag cgaccccgag gtcaagcgcc 2280
gcgccctagc cgagatgcgc gccgacgggc tggcgccctg gctggtcgtg gacgaccggc 2340
ggtccgtcgt ggatgcctgg cgggccgagg ggctggtctg cctgcaatgc gcgccggggg 2400
acttctaggg ccgcgcgacg ggggcgcgga caggctgggc gggaaaccgc cccgccacca 2460
tgtcctgcac gcgtcgaacc gcccgtccga cgccggtttc cgcacggaaa cgcgcggcaa 2520
gttgacataa cttgcacgcg acgtctcgat tctgcccgcg aagaatgcga tgcatccaga 2580
tgatgcagaa cgaagaagcg gaagcgcccg tgaaagacca gatgatttcc cataccccgg 2640
tgcccacgca atgggtcggc ccgatcctgt tccgcggccc cgtcgtcgag ggcccgatca 2700
gcgcgccgct ggccacctac gagacgccgc tctggccctc gaccgcgcgg ggggcagggg 2760
tttcccggca ttcgggcggg atccaggtct cgctggtcga cgaacgcatg agccgctcga 2820
tcgcgctgcg ggcgcatgac ggggcggcgg cgaccgccgc ctggcagtcg atcaaggccc 2880
gccaggaaga ggtcgcggcc gtggtcgcca ccaccagccg cttcgcccgc cttgtcgagc 2940
taaatcgcca gatcgtgggc aacctgcttt acatccgcat cgaatgcgtg acgggcgacg 3000
cctcgggtca caacatggtc accaaggccg ccgaggccgt gcagggctgg atcctgtcgg 3060
aatacccgat gctggcctat tccacgatct cggggaacct gtgcaccgac aagaaggcgt 3120
cggcggtcaa cggcatcctg ggccgcggca aatacgccgt cgccgaggtc gagatcccgc 3180
acaagatcct gacccgcgtg ctgcgcacca gcgccgagaa gatggtccgc ctgaactacg 3240
agaagaacta tgtcgggggt acgctggcgg ggtcgctgcg cagtgcgaac gcgcatttcg 3300
ccaacatgct gctgggcttc tacctggcga cggggcagga cgcggccaac atcatcgagg 3360
ccagccaggg cttcgtccat tgcgaggccc gcggcgagga tctgtatttc tcgtgcacgc 3420
tgcccaacct catcatgggc tcggtcggtg ccggcaaggg catcccctcg atcgaggaga 3480
acctgtcgcg gatgggctgc cgccagccgg gcgaacccgg cgacaacgcg cgccgtcttg 3540
cggcgatctg cgcgggcgtc gtgctgtgtg gtgaattgtc gctgcttgcg gcccagacca 3600
accccggaga gttggtccgc acccacatgg agatggagcg atg acc gac agc aag 3655
Met Thr Asp Ser Lys
1 5



gat cac cat gtc gcg ggg cgc aag ctg gac cat ctg cgt gca ttg gac 3703
Asp His His Val Ala Gly Arg Lys Leu Asp His Leu Arg Ala Leu Asp
10 15 20

gac gat gcg gat atc gac cgg ggc gac agc ggc ttc gac cgc atc gcg 3751
Asp Asp Ala Asp Ile Asp Arg Gly Asp Ser Gly Phe Asp Arg Ile Ala
25 30 35

ctg acc cat cgc gcc ctg ccc gag gtg gat ttc gac gcc atc gac acg 3799
Leu Thr His Arg Ala Leu Pro Glu Val Asp Phe Asp Ala Ile Asp Thr
40 45 50

gcg acc agc ttc ctg ggc cgt gaa ctg tcc ttc ccg ctg ctg atc tcg 3847
Ala Thr Ser Phe Leu Gly Arg Glu Leu Ser Phe Pro Leu Leu Ile Ser
55 60 65

tcc atg acc ggc ggc acc ggc gag gag atc gag cgc atc aac cgc aac 3895
Ser Met Thr Gly Gly Thr Gly Glu Glu Ile Glu Arg Ile Asn Arg Asn
70 75 80 85

ctg gcc gct ggt gcc gag gag gcc cgc gtc gcc atg gcg gtg ggc tcg 3943
Leu Ala Ala Gly Ala Glu Glu Ala Arg Val Ala Met Ala Val Gly Ser
90 95 100

cag cgc gtg atg ttc acc gac ccc tcg gcg cgg gcc agc ttc gac ctg 3991
Gln Arg Val Met Phe Thr Asp Pro Ser Ala Arg Ala Ser Phe Asp Leu
105 110 115

cgc gcc cat gcg ccc acc gtg ccg ctg ctg gcc aat atc ggc gcg gtg 4039
Arg Ala His Ala Pro Thr Val Pro Leu Leu Ala Asn Ile Gly Ala Val
120 125 130

cag ctg aac atg ggg ctg ggg ctg aag gaa tgc ctg gcc gcg atc gag 4087
Gln Leu Asn Met Gly Leu Gly Leu Lys Glu Cys Leu Ala Ala Ile Glu
135 140 145

gtg ctg cag gcg gac ggc ctg tat ctg cac ctg aac ccc ctg caa gag 4135
Val Leu Gln Ala Asp Gly Leu Tyr Leu His Leu Asn Pro Leu Gln Glu
150 155 160 165

gcc gtc cag ccc gag ggg gat cgc gac ttt gcc gat ctg ggc agc aag 4183
Ala Val Gln Pro Glu Gly Asp Arg Asp Phe Ala Asp Leu Gly Ser Lys
170 175 180

atc gcg gcc atc gcc cgc gac gtt ccc gtg ccc gtc ctg ctg aag gag 4231
Ile Ala Ala Ile Ala Arg Asp Val Pro Val Pro Val Leu Leu Lys Glu
185 190 195

gtg ggc tgc ggc ctg tcg gcg gcc gat atc gcc atc ggg ctg cgc gcc 4279
Val Gly Cys Gly Leu Ser Ala Ala Asp Ile Ala Ile Gly Leu Arg Ala
200 205 210

ggg atc cgg cat ttc gac gtg gcc ggt cgc ggc ggc aca tcc tgg agc 4327
Gly Ile Arg His Phe Asp Val Ala Gly Arg Gly Gly Thr Ser Trp Ser
215 220 225

cgg atc gag tat cgc cgc cgc cag cgg gcc gat gac gac ctg ggc ctg 4375
Arg Ile Glu Tyr Arg Arg Arg Gln Arg Ala Asp Asp Asp Leu Gly Leu
230 235 240 245

gtc ttc cag gac tgg ggc ctg cag acc gtg gac gcc ctg cgc gag gcg 4423
Val Phe Gln Asp Trp Gly Leu Gln Thr Val Asp Ala Leu Arg Glu Ala
250 255 260

cgg ccc gcg ctt gcg gcc cat gat gga acc agc gtg ctg atc gcc agc 4471
Arg Pro Ala Leu Ala Ala His Asp Gly Thr Ser Val Leu Ile Ala Ser
265 270 275

ggc ggc atc cgc aac ggt gtc gac atg gcg aaa tgc gtc atc ctg ggg 4519
Gly Gly Ile Arg Asn Gly Val Asp Met Ala Lys Cys Val Ile Leu Gly
280 285 290

gcc gac atg tgc ggg gtc gcc gcg ccc ctg ctg aaa gcg gcc caa aac 4567
Ala Asp Met Cys Gly Val Ala Ala Pro Leu Leu Lys Ala Ala Gln Asn
295 300 305

tcg cgc gag gcg gtt gta tcc gcc atc cgg aaa ctg cat ctg gag ttc 4615
Ser Arg Glu Ala Val Val Ser Ala Ile Arg Lys Leu His Leu Glu Phe
310 315 320 325

cgg aca gcc atg ttc ctc ctg ggt tgc ggc acg ctt gcc gac ctg aag 4663
Arg Thr Ala Met Phe Leu Leu Gly Cys Gly Thr Leu Ala Asp Leu Lys
330 335 340

gac aat tcc tcg ctt atc cgt caa tga aagtgcctaa gatgaccgtg 4710
Asp Asn Ser Ser Leu Ile Arg Gln
345



acaggaatcg aagcgatcag cttctacacc ccccagaact acgtgggact ggatatcctt 4770
gccgcgcatc acgggatcga ccccgagaag ttctcgaagg ggatcgggca ggagaaaatc 4830
gcactgcccg gccatgacga ggatatcgtg accatggccg ccgaggccgc gctgccgatc 4890
atcgaacgcg cgggcacgca gggcatcgac acggttctgt tcgccaccga gagcgggatc 4950
gaccagtcga aggccgccgc catctatctg cgccgcctgc tggacctgtc gcccaactgc 5010
cgttgcgtcg agctgaagca ggcctgctat tccgcgacgg cggcgctgca gatggcctgc 5070
gcgcatgtcg cccgcaagcc cgaccgcaag gtgctggtga tcgcgtccga tgtcgcgcgc 5130
tatgaccgcg aaagctcggg cgaggcgacg cagggtgcgg gcgccgtcgc catccttgtc 5190


^aroccaatc 


ccaaggtggc 


egagategge 


accgtctcgg 


ggctgttcac 


cgaggatatc 


5250 


ataaatttct 


ggcggccgaa 


ccaccgccgc 


acgcccctgt 


tegaeggcaa 


ggcatcgacg 


5310 


ctcrccrctatc 


tgaacgeget 


ggtcgaggcg 


tggaacgact 


ategegegaa 


tggcggccac 


5370 


gagttcgccg 


atttcgegea 


tttctgetat 


cacgtgccgt 


tetcgeggat 


gggcgagaag 


5430 


acaaacagcc 


acctggccaa 


ggcgaacaag 


acgccggtgg 


acatggggca 


ggtgcagacg 


5490 


qgectgatet 

^3 ^3 ^ 


acaaccggca 


ggtegggaac 


tgctataccg 


ggtcgatcta 


cctggcattc 


5550 


gcctCyCLyc 


uy y ciy aowy u 


tcaaaaqgac 


ctgaccggcg 


cgctggtcgg 


tctgttcagc 


5610 


tatggctegg 


gtgcgacggg 


cgaattcttc 


gatgegegga 


tcgcgcccgg 


ttaccgcgac 


5670 


cacctgttcg 


cggaacgcca 


tegegaattg 


ctgeaggate 


gcacgcccgt 


cacatatgac 


5730 


gaatacgttg 


ccctgtggga 


cgagatcgac 


ctgaegcagg 


gcgcgcccga 


caaggcgcgc 


5790 


ggtcgtttca 


ggctggcagg 


tatcgaggac 


gagaagegea 


tctatgtcga 


ccggcaggcc 


5850 


tgaagcaggc 


gcccatgccc 


egggcaaget 


gatcctgtcc 


ggggaacatt 


ccgtgctcta 


5910 


tggtgcgccc 


gcgcttgcca 


tggccatcgc 


ccgctatacc 


gaggtgtggt 


tcacgccgct 


5970 


tggcattggc 


gaggggatac 


gcacgacatt 


cgccaatctc 


tegggegggg 


cgacctattc 


6030 


gctgaagctg 


ctgtcggggt 


teaagtcgeg 


gctggaccgc 


eggttcgage 


agttcctgaa 


6090 


cggcgaccta 


aaggtgcaca 


aggtcctgac 


ccatcccgac 


gatctggegg 


tetatgeget 


6150 
ggcgtcgctt ctgcacgaca agccgccggg gaccgccgcg atgccgggca tcggcgcgat 6210

gcaccacctg ccgcgaccgg gtgagctggg cagccggacg gagctgccca tcggcgcggg 6270

catggggtcg tctgcggcca tcgtcgcggc caccacggtc ctgttcgaga cgctgctgga 6330 

ccggcccaag acgcccgaac agcgcttcga ccgcgtccgc ttctgcgagc ggttgaagca 6390 

cggcaaggcc ggtcccatcg acgcggccag cgtcgtgcgc ggcgggcttg tccgcgtggg 6450 

cgggaacggg ccgggttcga tcagcagctt cgatttgccc gaggatcacg accttgtcgc 6510 

gggacgcggc tggtactggg tactgcacgg gcgccccgtc agcgggaccg gcgaatgcgt 6570 

cagcgcggtc gcggcggcgc atggtcgcga tgcggcgctg tgggacgcct tcgcagtctg 6630

cacccgcgcg ttggaggccg cgctgctgtc tgggggcagc cccgacgccg ccatcaccga 6690

gaaccagcgc ctgctggaac gcatcggcgt cgtgccggca gcgacgcagg ccctcgtggc 6750 

ccagatcgag gaggcgggtg gcgcggccaa gatctgcggc gcaggttccg tgcggggcga 6810 

tcacggcggg gcggtcctcg tgcggattga cgacgcgcag gcgatggctt cggtcatggc 6870

gcgccatccc gacctcgact gggcgcccct gcgcatgtcg cgcacggggg cggcacccgg 6930

ccccgcgccg cgtgcgcaac cgctgccggg gcagggctga tggatcaggt catccgcgcc 6990 

agcgcgccgg gttcggtcat gatcacgggc gaacatgccg tggtctatgg acaccgcgcc 7050 

atcgtcgccg ggatcgagca gcgcgcccat gtgacgatcg tcccgcgtgc cgaccgcatg 7110 

tttcgcatca cctcgcagat cggggcgccg cagcaggggt cgctggacga tctgcctgcg 7170

ggcgggacct atcgcttcgt gctggccgcc atcgcgcgac acgcgccgga cctgccttgc 7230 

gggttcgaca tggacatcac ctcggggatc gatccgaggc tcgggcttgg atcctcggcg 7290

gcggtgacgg tcgcctgcct cggcgcgctg tcgcggctgg cggggcgggg gaccgagggg 7350 

ctgcatgacg acgcgctgcg catcgtccgc gccatccagg gcaggggcag cggggccgat 7410 

ctggcggcca gcctgcatgg cggcttcgtc gcctatcgcg cgcccgatgg cggtgccgcg 7470 

cagatcgagg cgcttccggt gccgccgggg ccgttcggcc tgcgctatgc gggctacaag 7530 

accccgacag ccgaggtgct gcgccttgtg gccgatcgga tggcgggcaa cgaggccgct 7590 

ttcgacgcgc tctactcccg gatgggcgca agcgcagatg ccgcgatccg cgcggcgcaa 7650 

gggctggact gggctgcatt ccacgacgcg ctgaacgaat accagcgcct gatggagcag 7710 

ctgggcgtgt ccgacgacac gctggacgcg atcatccgcg aggcgcgcga cgcgggcgcc 7770 

gcagtcgcca agatctccgg ctcggggctg ggggattgcg tgctggcact gggcgaccag 7830 
cccaagggtt tcgtgcccgc aagcattgcc gagaagggac ttgttttcga tgactgatgc 7890
cgtccgcgac atgatcgccc gtgccatggc gggcgcgacc gacatccgag cagccgaggc 7950
ttatgcgccc agcaacatcg cgctgtcgaa atactggggc aagcgcgacg ccgcgcggaa 8010
ccttccgctg aacagctccg tctcgatctc gttggcgaac tggggctctc atacgcgggt 8070
cgaggggtcc ggcacgggcc acgacgaggt gcatcacaac ggcacgctgc tggatccggg 8130
cgacgccttc gcgcgccgcg cgttggcatt cgctgacctg ttccgggggg ggaggcacct 8190
gccgctgcgg atcacgacgc agaactcgat cccgacggcg gcggggcttg cctcgtcggc 8250
ctcggggttc gcggcgctga cccgtgcgct ggcgggggcg ttcgggctgg atctggacga 8310
cacggatctg agccgcatcg cccggatcgg cagtggcagc gccgcccgct cgatctggca 8370
cggcttcgtc cgctggaacc ggggcgaggc cgaggatggg catgacagcc acggcgtccc 8430
gctggacctg cgctggcccg gcttccgcat cgcgatcgtg gccgtggaca aggggcccaa 8490
gcctttcagt tcgcgcgacg gcatgaacca cacggtcgag accagcccgc tgttcccgcc 8550
ctggcctgcg caggcggaag cggattgccg cgtcatcgag gatgcgatcg ccgcccgcga 8610
catggccgcc ctgggtccgc gggtcgaggc gaacgccctt gcgatgcacg ccacgatgat 8670
ggccgcgcgc ccgccgctct gctacctgac gggcggcagc tggcaggtgc tggaacgcct 8730
gtggcaggcc cgcgcggacg ggcttgcggc ctttgcgacg atggatgccg gcccgaacgt 8790
caagctgatc ttcgaggaaa gcagcgccgc cgacgtgctg tacctgttcc ccgacgccag 8850
cctgatcgcg ccgttcgagg ggcgttgaac gcgtaagacg accactgggt aaggttctgc 8910
cgcgcgtggt ctcgactgcc tgcaaagagg tgcttgagtt gctgcgtgac tgcggcggcc 8970
gacttcgtgg gacttgcccg ccacgctgac gcgctggaaa cgcgcccgcg gattacgacc 9030
gcgtcattgc cctgaaccaa tttcccgtcg gtcgac 9066



<210> 45 



<211> 349 



<212> PRT 



<213> Paracoccus sp. R114 



<400> 45 

Met Thr Asp Ser Lys Asp His His Val Ala Gly Arg Lys Leu Asp His
1 5 10 15

Leu Arg Ala Leu Asp Asp Asp Ala Asp Ile Asp Arg Gly Asp Ser Gly
20 25 30

Phe Asp Arg Ile Ala Leu Thr His Arg Ala Leu Pro Glu Val Asp Phe
35 40 45

Asp Ala Ile Asp Thr Ala Thr Ser Phe Leu Gly Arg Glu Leu Ser Phe
50 55 60

Pro Leu Leu Ile Ser Ser Met Thr Gly Gly Thr Gly Glu Glu Ile Glu
65 70 75 80

Arg Ile Asn Arg Asn Leu Ala Ala Gly Ala Glu Glu Ala Arg Val Ala
85 90 95

Met Ala Val Gly Ser Gln Arg Val Met Phe Thr Asp Pro Ser Ala Arg
100 105 110

Ala Ser Phe Asp Leu Arg Ala His Ala Pro Thr Val Pro Leu Leu Ala
115 120 125

Asn Ile Gly Ala Val Gln Leu Asn Met Gly Leu Gly Leu Lys Glu Cys
130 135 140

Leu Ala Ala Ile Glu Val Leu Gln Ala Asp Gly Leu Tyr Leu His Leu
145 150 155 160

Asn Pro Leu Gln Glu Ala Val Gln Pro Glu Gly Asp Arg Asp Phe Ala
165 170 175

Asp Leu Gly Ser Lys Ile Ala Ala Ile Ala Arg Asp Val Pro Val Pro
180 185 190

Val Leu Leu Lys Glu Val Gly Cys Gly Leu Ser Ala Ala Asp Ile Ala
195 200 205

Ile Gly Leu Arg Ala Gly Ile Arg His Phe Asp Val Ala Gly Arg Gly
210 215 220

Gly Thr Ser Trp Ser Arg Ile Glu Tyr Arg Arg Arg Gln Arg Ala Asp
225 230 235 240

Asp Asp Leu Gly Leu Val Phe Gln Asp Trp Gly Leu Gln Thr Val Asp
245 250 255

Ala Leu Arg Glu Ala Arg Pro Ala Leu Ala Ala His Asp Gly Thr Ser
260 265 270

Val Leu Ile Ala Ser Gly Gly Ile Arg Asn Gly Val Asp Met Ala Lys
275 280 285

Cys Val Ile Leu Gly Ala Asp Met Cys Gly Val Ala Ala Pro Leu Leu
290 295 300

Lys Ala Ala Gln Asn Ser Arg Glu Ala Val Val Ser Ala Ile Arg Lys
305 310 315 320

Leu His Leu Glu Phe Arg Thr Ala Met Phe Leu Leu Gly Cys Gly Thr
325 330 335

Leu Ala Asp Leu Lys Asp Asn Ser Ser Leu Ile Arg Gln
340 345



<210> 46 

<211> 9066 

<212> DNA 

<213> Paracoccus sp. R114 



<220> 

<221> CDS 

<222> (4687)..(5853)

<223> hcs 



<400> 46 

ggatccggca gctcgacacg ccgcagaacc tgtacgaacg tcccgccagc cgcttcgtcg 60
cggaattcgt cgggcgcggg acggtggtgc ccgtgcaggc ccatgacggc gcgggccgcg 120
cccgcatcct gggggccgag gtggcggtga acgccgcccc gcaatcgcgc tttgtcgatc 180
acgtctgcct gcgccccgag aaccttgcca tctccgagac gggcgacctg cgcgccaagg 240
tcgcgcgcgt cacctatctt ggcgggaaat acctgctgga aaccgtgctg gattgcggca 300
cccggctggt gaccgagacc cgcgcccgct tcgatacggg cgcgcagctt ggcctgacca 360
tcaacgcccc ctgggccttt gccgaggatt gaatggacag cgtgaagatc ctttcgggca 420
tgggcgtgaa gggccctgcc tgcatcaggc tggatgtcgg cgggatgcgc ctgatcctcg 480
attgcgggac cggcccggac gagggcgcgg agttcgaccc cgcctggctg gcggacgcgg 540
atgcggtgct gatcacccat gaccacgtgg accatatcgg cggcgcgcgt cacgcggtcg 600
cggcggggct gccgatccat gcgacgcggc agacggcggg gttgctgccc gcgggggcgg 660
atctgcgcct gctgcccgaa cgcggtgtca cgcggatcgc cggggtcgat ctgacgaccg 720
gtcgcaacgg gcatgccgcg ggcggcgtct ggatgcattt cgacatgggc gaggggctgt 780
tctattccgg caactggtcc gaggaatccg actggttcgc cttcgatccg cccccgcctg 840
cggggacggc gattctcgac tgctcctatg gcggtttcga cgtggcgcaa tcggattgca 900
tcgcggacct ggacgacctg ctcgaggtgc tgccggggca ggtactgctg ccggtgccgc 960
catccggccg cgcggccgag ctggccctgc ggctgatccg ccgccacgga ccgggcagcg 1020
tgatggtcga cgacgcctgc ctgccggcca tcgcgcaact gcccgaggcg cgcggactgg 1080
cctacgccac cgaggcacgc tttcttgtct gcgacacgcc gaacgccgaa agccggcgcg 1140
gcatggcggc atctgcaagc atggcgcgat gcgggcaggc tggggcggga cgcgcatgtc 1200
gtcttcaccg ggcacatgaa cgtccatgcg cgcgcattct gcgaccgccc cggcgggcat 1260
ttccgccgct ggaacgtgca tccgccgctg cgcgaccagc gacggatgct ggaacggctg 1320
gccgcgcggc gctttgcccc ggccttctgc cccgaccccg agatctatct ggcgctggac 1380
atgggcgcgc aggtcttcat gcaccaggag gtgacgccat gatccccgcc cgcagcttct 1440
gcctgatccg ccacggcgaa acgaccgcca atgcaggggc gatcatcgcg ggcgcaaccg 1500
atgtgcccct gacgccaagg ggccgcgatc aggcccgcgc cctggcaggg cgcgaatggc 1560
catcgggcat cgcgctgttc gccagcccga tgtcgcgcgc ccgcgatacc gcgctgctgg 1620
cctttccggg gcgcgaccac cagcccgaac ccgatctgcg cgaacgcgac tggggcatct 1680
tcgagggacg ccccgtcgcc gatctgcccc cgcgcgaaat cacgccgcag gggggcgagg 1740
gctgggacga cgtgatggcc cgcgtggacc gcgcgatccg gcggatctgc gcgacctcgg 1800
gcgatgcgct gccggtgctg gtctgccatt cgggcgtgat ccgtgccgcg cgcgtgctgt 1860
ggaccaccgg cgatgcgggc gatcgtccgc ccaacgccac gccgatcctg ttcagcccgg 1920
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acggcgaccg 


attaaaggaa 


ggaacga tac 


acgtggacgg 


cacgcttgcc 


gaattcgacg 


cgaccaagca 


ctgggacgcc 


ttccaccacg 


tcgcccgcct 


gatgcgcaag 


c tgaaggagg 


ggccccgcgg 


ctggcaggat 


cagacgatcg 


acgggatcta 


tctgcgcccc 


gaggatcagg 


gcgccctagc 


cgagatgcgc 


gccgacgggc 


ggtccgtcgt 


ggatgcctgg 


cgggccgagg 


acttctaggg 


ccgcgcgacg 


ggggcgcgga 


tgtcctgcac 


gcgtcgaacc 


gcccgtccga 


gttgacataa 


cttgcacgcg 


acgtctcgat 


tgatgcagaa 


cgaagaagcg 


gaagcgcccg 


tgcccacgca 


atgggtcggc 


ccgatcctgt 


gcgcgccgct 


ggccacctac 


gagacgccgc 


tttcccggca 


ttcgggcggg 


atccaggtct 


tcgcgctgcg 


ggcgcatgac 


ggggcggcgg 


gccaggaaga 


ggtcgcggcc 


gtggtcgcca 


tgaatcgcca 


gatcgtgggc 


aacctgcttt 


cctcgggtca 


caacatggtc 


accaaggccg 


aatacccgat 


gctggcctat 


tccacgatct 


cggcggtcaa 


cggcatcctg 


ggccgcggca 


gcaagatcct 


gacccgcgtg 


ctgcgcacca 


agaagaacta 


tgtcgggggt 


acgctggcgg 


ccaacatgct 


gctgggcttc 


tacctggcga 


ccagccaggg 


cttcgtccat 


tgcgaggccc 


tgcccaacct 


catcatgggc 


tcggtcggtg 


acctgtcgcg 


gatgggctgc 


cgccagccgg 


cggcgatctg 


cgcgggcgtc 


gtgctgtgtg 


accccggaga 


gttggtccgc 


acccacatgg 
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tccrtc t tcaa 


1980 




neicic^C ri t~_ C t"_ t 

y y y v_ c* ^~ 


otccacaoca 


2040 






atccccaaoo 


2100 


gggg c g^y ac 


y y u ^— y uwau^. 


tcrc tic crcraac 


2160 


CauyyC cycy 


Lctcty v>a^»y Q\> 


ctcrcccttco 


2220 


acggcgccag 


cyaccci-yay 




22 80 


tggcgccccg 


gc cggccytg 


ya^.yci^L>y y ^ 


2340 

<C> »J TX V# 


ggctggtctg 


CCCyCadLyc 


gcycuyyyyy 


2400 


caggc cgggc 


ggyaaav.uy^ 


rprrfpr'arpa 
Lv^cyu<wGUL.a 


2460 


cgccggttcc 


cgcacggaaa 


LyLyuyycaa. 


2520 


tctgcccgcg 


aagaa cgcga 


LyL.aH_L.aya 


2580 


tgaaagacca 


gatgat tccc 




Z. U *l u 


tccgcggccc 


cgccgrcgag 


/T f /— r f— a 

yytCLyd llci 


9700 


tctggccctc 


gaccgcgcgg 


ggggcagygy 




cgccggtcga 


cgaacycauy 


ay LL,yL Lty ci 


2ft 90 


cgaccgccgc 


ccggcagucg 


u LL Ctd y y LL-L 


2880 

i> u u u 


ccaccagccg 


Cutcgcccgc 


V- *- /-ft" O/TJ^PTP" 
LLLyLUyciyu 


2940 


acacccycat: 


cgaauycy uy 


d^yyy Lyauy 


3000 


ccgayyccy t 


gcayyycuyy 


d u L- l_ y u<-yy 


3060 


y» f1f*T S 3 4~ 

cgygyaaccu 


y uy Ldw^yoL 


aay aayyuy 


3120 




<^ rr r* r* rr a Cf cr t" f 


era era tcccCTC 


3180 


ct /■*• i~f z& rr3 3 
yCyLLyayaa 


yc* Lyy n-cy^- 


c! tcraac taccf 


3240 


gytcyc <-g^-y 




acacatttca 


3300 


cggggcagga 






3360 


gcggcgagga 


tctgtatttc 


tcgtgcacgc 


3420 


ccggcaaggg 


catcccctcg 


atcgaggaga 


3480 


gcgaacccgg 


cgacaacgcg 


cgccgtcttg 


3540 


gtgaattgtc 


gctgcttgcg 


gcccagacca 


3600 


agatggagcg 


atgaccgaca 


gcaaggatca 


3660 
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o 



( 



ccatgtcgcg gggcgcaagc tggaccatct gcgtgcattg gacgacgatg cggatatcga 3720

ccggggcgac agcggcttcg accgcatcgc gctgacccat cgcgccctgc ccgaggtgga 3780

tttcgacgcc atcgacacgg cgaccagctt cctgggccgt gaactgtcct tcccgctgct 3840

gatctcgtcc atgaccggcg gcaccggcga ggagatcgag cgcatcaacc gcaacctggc 3900

cgctggtgcc gaggaggccc gcgtcgccat ggcggtgggc tcgcagcgcg tgatgttcac 3960

cgacccctcg gcgcgggcca gcttcgacct gcgcgcccat gcgcccaccg tgccgctgct 4020

ggccaatatc ggcgcggtgc agctgaacat ggggctgggg ctgaaggaat gcctggccgc 4080

gatcgaggtg ctgcaggcgg acggcctgta tctgcacctg aaccccctgc aagaggccgt 4140

ccagcccgag ggggatcgcg actttgccga tctgggcagc aagatcgcgg ccatcgcccg 4200

cgacgttccc gtgcccgtcc tgctgaagga ggtgggctgc ggcctgtcgg cggccgatat 4260

cgccatcggg ctgcgcgccg ggatccggca tttcgacgtg gccggtcgcg gcggcacatc 4320

ctggagccgg atcgagtatc gccgccgcca gcgggccgat gacgacctgg gcctggtctt 4380

ccaggactgg ggcctgcaga ccgtggacgc cctgcgcgag gcgcggcccg cgcttgcggc 4440

ccatgatgga accagcgtgc tgatcgccag cggcggcatc cgcaacggtg tcgacatggc 4500

gaaatgcgtc atcctggggg ccgacatgtg cggggtcgcc gcgcccctgc tgaaagcggc 4560

ccaaaactcg cgcgaggcgg ttgtatccgc catccggaaa ctgcatctgg agttccggac 4620

agccatgttc ctcctgggtt gcggcacgct tgccgacctg aaggacaatt cctcgcttat 4680


ccgtca atg aaa gtg cct aag atg acc gtg aca gga atc gaa gcg atc  4728
       Met Lys Val Pro Lys Met Thr Val Thr Gly Ile Glu Ala Ile
       1               5                   10

agc ttc tac acc ccc cag aac tac gtg gga ctg gat atc ctt gcc gcg  4776
Ser Phe Tyr Thr Pro Gln Asn Tyr Val Gly Leu Asp Ile Leu Ala Ala
15                  20                  25                  30

cat cac ggg atc gac ccc gag aag ttc tcg aag ggg atc ggg cag gag  4824
His His Gly Ile Asp Pro Glu Lys Phe Ser Lys Gly Ile Gly Gln Glu
                35                  40                  45

aaa atc gca ctg ccc ggc cat gac gag gat atc gtg acc atg gcc gcc  4872
Lys Ile Ala Leu Pro Gly His Asp Glu Asp Ile Val Thr Met Ala Ala
            50                  55                  60

gag gcc gcg ctg ccg atc atc gaa cgc gcg ggc acg cag ggc atc gac  4920
Glu Ala Ala Leu Pro Ile Ile Glu Arg Ala Gly Thr Gln Gly Ile Asp
        65                  70                  75

acg gtt ctg ttc gcc acc gag agc ggg atc gac cag tcg aag gcc gcc  4968
Thr Val Leu Phe Ala Thr Glu Ser Gly Ile Asp Gln Ser Lys Ala Ala
    80                  85                  90

gcc atc tat ctg cgc cgc ctg ctg gac ctg tcg ccc aac tgc cgt tgc  5016
Ala Ile Tyr Leu Arg Arg Leu Leu Asp Leu Ser Pro Asn Cys Arg Cys
95                  100                 105                 110

gtc gag ctg aag cag gcc tgc tat tcc gcg acg gcg gcg ctg cag atg  5064
Val Glu Leu Lys Gln Ala Cys Tyr Ser Ala Thr Ala Ala Leu Gln Met
                115                 120                 125

gcc tgc gcg cat gtc gcc cgc aag ccc gac cgc aag gtg ctg gtg atc  5112
Ala Cys Ala His Val Ala Arg Lys Pro Asp Arg Lys Val Leu Val Ile
            130                 135                 140

gcg tcc gat gtc gcg cgc tat gac cgc gaa agc tcg ggc gag gcg acg  5160
Ala Ser Asp Val Ala Arg Tyr Asp Arg Glu Ser Ser Gly Glu Ala Thr
        145                 150                 155

cag ggt gcg ggc gcc gtc gcc atc ctt gtc agc gcc gat ccc aag gtg  5208
Gln Gly Ala Gly Ala Val Ala Ile Leu Val Ser Ala Asp Pro Lys Val
    160                 165                 170

gcc gag atc ggc acc gtc tcg ggg ctg ttc acc gag gat atc atg gat  5256
Ala Glu Ile Gly Thr Val Ser Gly Leu Phe Thr Glu Asp Ile Met Asp
175                 180                 185                 190

ttc tgg cgg ccg aac cac cgc cgc acg ccc ctg ttc gac ggc aag gca  5304
Phe Trp Arg Pro Asn His Arg Arg Thr Pro Leu Phe Asp Gly Lys Ala
                195                 200                 205

tcg acg ctg cgc tat ctg aac gcg ctg gtc gag gcg tgg aac gac tat  5352
Ser Thr Leu Arg Tyr Leu Asn Ala Leu Val Glu Ala Trp Asn Asp Tyr
            210                 215                 220

cgc gcg aat ggc ggc cac gag ttc gcc gat ttc gcg cat ttc tgc tat  5400
Arg Ala Asn Gly Gly His Glu Phe Ala Asp Phe Ala His Phe Cys Tyr
        225                 230                 235

cac gtg ccg ttc tcg cgg atg ggc gag aag gcg aac agc cac ctg gcc  5448
His Val Pro Phe Ser Arg Met Gly Glu Lys Ala Asn Ser His Leu Ala
    240                 245                 250

aag gcg aac aag acg ccg gtg gac atg ggg cag gtg cag acg ggc ctg  5496
Lys Ala Asn Lys Thr Pro Val Asp Met Gly Gln Val Gln Thr Gly Leu
255                 260                 265                 270

atc tac aac cgg cag gtc ggg aac tgc tat acc ggg tcg atc tac ctg  5544
Ile Tyr Asn Arg Gln Val Gly Asn Cys Tyr Thr Gly Ser Ile Tyr Leu
                275                 280                 285

gca ttc gcc tcg ctg ctg gag aac gct cag gag gac ctg acc ggc gcg  5592
Ala Phe Ala Ser Leu Leu Glu Asn Ala Gln Glu Asp Leu Thr Gly Ala
            290                 295                 300

ctg gtc ggt ctg ttc agc tat ggc tcg ggt gcg acg ggc gaa ttc ttc  5640
Leu Val Gly Leu Phe Ser Tyr Gly Ser Gly Ala Thr Gly Glu Phe Phe
        305                 310                 315

gat gcg cgg atc gcg ccc ggt tac cgc gac cac ctg ttc gcg gaa cgc  5688
Asp Ala Arg Ile Ala Pro Gly Tyr Arg Asp His Leu Phe Ala Glu Arg
    320                 325                 330

cat cgc gaa ttg ctg cag gat cgc acg ccc gtc aca tat gac gaa tac  5736
His Arg Glu Leu Leu Gln Asp Arg Thr Pro Val Thr Tyr Asp Glu Tyr
335                 340                 345                 350

gtt gcc ctg tgg gac gag atc gac ctg acg cag ggc gcg ccc gac aag  5784
Val Ala Leu Trp Asp Glu Ile Asp Leu Thr Gln Gly Ala Pro Asp Lys
                355                 360                 365

gcg cgc ggt cgt ttc agg ctg gca ggt atc gag gac gag aag cgc atc  5832
Ala Arg Gly Arg Phe Arg Leu Ala Gly Ile Glu Asp Glu Lys Arg Ile
            370                 375                 380

tat gtc gac cgg cag gcc tga agcaggcgcc catgccccgg gcaagctgat  5883
Tyr Val Asp Arg Gln Ala
        385



cctgtccggg gaacattccg tgctctatgg tgcgcccgcg cttgccatgg ccatcgcccg 5943

ctataccgag gtgtggttca cgccgcttgg cattggcgag gggatacgca cgacattcgc 6003

caatctctcg ggcggggcga cctattcgct gaagctgctg tcggggttca agtcgcggct 6063

ggaccgccgg ttcgagcagt tcctgaacgg cgacctaaag gtgcacaagg tcctgaccca 6123

tcccgacgat ctggcggtct atgcgctggc gtcgcttctg cacgacaagc cgccggggac 6183

cgccgcgatg ccgggcatcg gcgcgatgca ccacctgccg cgaccgggtg agctgggcag 6243

ccggacggag ctgcccatcg gcgcgggcat ggggtcgtct gcggccatcg tcgcggccac 6303

cacggtcctg ttcgagacgc tgctggaccg gcccaagacg cccgaacagc gcttcgaccg 6363

cgtccgcttc tgcgagcggt tgaagcacgg caaggccggt cccatcgacg cggccagcgt 6423

cgtgcgcggc gggcttgtcc gcgtgggcgg gaacgggccg ggttcgatca gcagcttcga 6483

tttgcccgag gatcacgacc ttgtcgcggg acgcggctgg tactgggtac tgcacgggcg 6543

ccccgtcagc gggaccggcg aatgcgtcag cgcggtcgcg gcggcgcatg gtcgcgatgc 6603

ggcgctgtgg gacgccttcg cagtctgcac ccgcgcgttg gaggccgcgc tgctgtctgg 6663

gggcagcccc gacgccgcca tcaccgagaa ccagcgcctg ctggaacgca tcggcgtcgt 6723

gccggcagcg acgcaggccc tcgtggccca gatcgaggag gcgggtggcg cggccaagat 6783

ctgcggcgca ggttccgtgc ggggcgatca cggcggggcg gtcctcgtgc ggattgacga 6843

cgcgcaggcg atggcttcgg tcatggcgcg ccatcccgac ctcgactggg cgcccctgcg 6903

catgtcgcgc acgggggcgg cacccggccc cgcgccgcgt gcgcaaccgc tgccggggca 6963

gggctgatgg atcaggtcat ccgcgccagc gcgccgggtt cggtcatgat cacgggcgaa 7023

catgccgtgg tctatggaca ccgcgccatc gtcgccggga tcgagcagcg cgcccatgtg 7083

acgatcgtcc cgcgtgccga ccgcatgttt cgcatcacct cgcagatcgg ggcgccgcag 7143

caggggtcgc tggacgatct gcctgcgggc gggacctatc gcttcgtgct ggccgccatc 7203

gcgcgacacg cgccggacct gccttgcggg ttcgacatgg acatcacctc ggggatcgat 7263

ccgaggctcg ggcttggatc ctcggcggcg gtgacggtcg cctgcctcgg cgcgctgtcg 7323

cggctggcgg ggcgggggac cgaggggctg catgacgacg cgctgcgcat cgtccgcgcc 7383

atccagggca ggggcagcgg ggccgatctg gcggccagcc tgcatggcgg cttcgtcgcc 7443

tatcgcgcgc ccgatggcgg tgccgcgcag atcgaggcgc ttccggtgcc gccggggccg 7503

ttcggcctgc gctatgcggg ctacaagacc ccgacagccg aggtgctgcg ccttgtggcc 7563

gatcggatgg cgggcaacga ggccgctttc gacgcgctct actcccggat gggcgcaagc 7623

gcagatgccg cgatccgcgc ggcgcaaggg ctggactggg ctgcattcca cgacgcgctg 7683

aacgaatacc agcgcctgat ggagcagctg ggcgtgtccg acgacacgct ggacgcgatc 7743

atccgcgagg cgcgcgacgc gggcgccgca gtcgccaaga tctccggctc ggggctgggg 7803

gattgcgtgc tggcactggg cgaccagccc aagggtttcg tgcccgcaag cattgccgag 7863

aagggacttg ttttcgatga ctgatgccgt ccgcgacatg atcgcccgtg ccatggcggg 7923

cgcgaccgac atccgagcag ccgaggctta tgcgcccagc aacatcgcgc tgtcgaaata 7983

ctggggcaag cgcgacgccg cgcggaacct tccgctgaac agctccgtct cgatctcgtt 8043

ggcgaactgg ggctctcata cgcgggtcga ggggtccggc acgggccacg acgaggtgca 8103

tcacaacggc acgctgctgg atccgggcga cgccttcgcg cgccgcgcgt tggcattcgc 8163

tgacctgttc cgggggggga ggcacctgcc gctgcggatc acgacgcaga actcgatccc 8223

gacggcggcg gggcttgcct cgtcggcctc ggggttcgcg gcgctgaccc gtgcgctggc 8283

gggggcgttc gggctggatc tggacgacac ggatctgagc cgcatcgccc ggatcggcag 8343

tggcagcgcc gcccgctcga tctggcacgg cttcgtccgc tggaaccggg gcgaggccga 8403

ggatgggcat gacagccacg gcgtcccgct ggacctgcgc tggcccggct tccgcatcgc 8463

gatcgtggcc gtggacaagg ggcccaagcc tttcagttcg cgcgacggca tgaaccacac 8523

ggtcgagacc agcccgctgt tcccgccctg gcctgcgcag gcggaagcgg attgccgcgt 8583

catcgaggat gcgatcgccg cccgcgacat ggccgccctg ggtccgcggg tcgaggcgaa 8643

cgcccttgcg atgcacgcca cgatgatggc cgcgcgcccg ccgctctgct acctgacggg 8703






WO 02/099095 



PCT/EP02/06171 






cggcagctgg caggtgctgg aacgcctgtg gcaggcccgc gcggacgggc ttgcggcctt 8763 

tgcgacgatg gatgccggcc cgaacgtcaa gctgatcttc gaggaaagca gcgccgccga 8823 

cgtgctgtac ctgttccccg acgccagcct gatcgcgccg ttcgaggggc gttgaacgcg 8883 

taagacgacc actgggtaag gttctgccgc gcgtggtctc gactgcctgc aaagaggtgc 8943 

ttgagttgct gcgtgactgc ggcggccgac ttcgtgggac ttgcccgcca cgctgacgcg 9003 

ctggaaacgc gcccgcggat tacgaccgcg tcattgccct gaaccaattt cccgtcggtc 9063 

gac 9066 

<210> 47 

<211> 388 

<212> PRT 

<213> Paracoccus sp. R114 



<400> 47 

Met Lys Val Pro Lys Met Thr Val Thr Gly Ile Glu Ala Ile Ser Phe
1               5                   10                  15

Tyr Thr Pro Gln Asn Tyr Val Gly Leu Asp Ile Leu Ala Ala His His
            20                  25                  30

Gly Ile Asp Pro Glu Lys Phe Ser Lys Gly Ile Gly Gln Glu Lys Ile
        35                  40                  45

Ala Leu Pro Gly His Asp Glu Asp Ile Val Thr Met Ala Ala Glu Ala
    50                  55                  60

Ala Leu Pro Ile Ile Glu Arg Ala Gly Thr Gln Gly Ile Asp Thr Val
65                  70                  75                  80

Leu Phe Ala Thr Glu Ser Gly Ile Asp Gln Ser Lys Ala Ala Ala Ile
                85                  90                  95

Tyr Leu Arg Arg Leu Leu Asp Leu Ser Pro Asn Cys Arg Cys Val Glu
            100                 105                 110

Leu Lys Gln Ala Cys Tyr Ser Ala Thr Ala Ala Leu Gln Met Ala Cys
        115                 120                 125

Ala His Val Ala Arg Lys Pro Asp Arg Lys Val Leu Val Ile Ala Ser
    130                 135                 140

Asp Val Ala Arg Tyr Asp Arg Glu Ser Ser Gly Glu Ala Thr Gln Gly
145                 150                 155                 160

Ala Gly Ala Val Ala Ile Leu Val Ser Ala Asp Pro Lys Val Ala Glu
                165                 170                 175

Ile Gly Thr Val Ser Gly Leu Phe Thr Glu Asp Ile Met Asp Phe Trp
            180                 185                 190

Arg Pro Asn His Arg Arg Thr Pro Leu Phe Asp Gly Lys Ala Ser Thr
        195                 200                 205

Leu Arg Tyr Leu Asn Ala Leu Val Glu Ala Trp Asn Asp Tyr Arg Ala
    210                 215                 220

Asn Gly Gly His Glu Phe Ala Asp Phe Ala His Phe Cys Tyr His Val
225                 230                 235                 240

Pro Phe Ser Arg Met Gly Glu Lys Ala Asn Ser His Leu Ala Lys Ala
                245                 250                 255

Asn Lys Thr Pro Val Asp Met Gly Gln Val Gln Thr Gly Leu Ile Tyr
            260                 265                 270

Asn Arg Gln Val Gly Asn Cys Tyr Thr Gly Ser Ile Tyr Leu Ala Phe
        275                 280                 285

Ala Ser Leu Leu Glu Asn Ala Gln Glu Asp Leu Thr Gly Ala Leu Val
    290                 295                 300

Gly Leu Phe Ser Tyr Gly Ser Gly Ala Thr Gly Glu Phe Phe Asp Ala
305                 310                 315                 320

Arg Ile Ala Pro Gly Tyr Arg Asp His Leu Phe Ala Glu Arg His Arg
                325                 330                 335

Glu Leu Leu Gln Asp Arg Thr Pro Val Thr Tyr Asp Glu Tyr Val Ala
            340                 345                 350

Leu Trp Asp Glu Ile Asp Leu Thr Gln Gly Ala Pro Asp Lys Ala Arg
        355                 360                 365

Gly Arg Phe Arg Leu Ala Gly Ile Glu Asp Glu Lys Arg Ile Tyr Val
    370                 375                 380

Asp Arg Gln Ala
385


<210> 48 
<211> 9066 



<212> DNA 

<213> Paracoccus sp. R114 



<220> 

<221> CDS 

<222> (5834)..(6970)

<223> mvk 



<400> 48 

ggatccggca gctcgacacg ccgcagaacc tgtacgaacg tcccgccagc cgcttcgtcg 60 

cggaattcgt cgggcgcggg acggtggtgc ccgtgcaggc ccatgacggc gcgggccgcg 120 

cccgcatcct gggggccgag gtggcggtga acgccgcccc gcaatcgcgc tttgtcgatc 180 

acgtctgcct gcgccccgag aaccttgcca tctccgagac gggcgacctg cgcgccaagg 240 

tcgcgcgcgt cacctatctt ggcgggaaat acctgctgga aaccgtgctg gattgcggca 300 

cccggctggt gaccgagacc cgcgcccgct tcgatacggg cgcgcagctt ggcctgacca 360 

tcaacgcccc ctgggccttt gccgaggatt gaatggacag cgtgaagatc ctttcgggca 420 

tgggcgtgaa gggccctgcc tgcatcaggc tggatgtcgg cgggatgcgc ctgatcctcg 480 

attgcgggac cggcccggac gagggcgcgg agttcgaccc cgcctggctg gcggacgcgg 540

atgcggtgct gatcacccat gaccacgtgg accatatcgg cggcgcgcgt cacgcggtcg 600 

cggcggggct gccgatccat gcgacgcggc agacggcggg gttgctgccc gcgggggcgg 660 
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atctgcgcct gctgcccgaa cgcggtgtca cgcggatcgc cggggtcgat ctgacgaccg 720 

gtcgcaacgg gcatgccgcg ggcggcgtct ggatgcattt cgacatgggc gaggggctgt 780 

tctattccgg cgactggtcc gaggaatccg actggttcgc cttcgatccg cccccgcctg 840 

cggggacggc gattctcgac tgctcctatg gcggtttcga cgtggcgcaa tcggattgca 900 

tcgcggacct ggacgacctg ctcgaggtgc tgccggggca ggtactgctg ccggtgccgc 960 

catccggccg cgcggccgag ctggccctgc ggctgatccg ccgccacgga ccgggcagcg 1020 

tgatggtcga cgacgcctgc ctgccggcca tcgcgcaact gcccgaggcg cgcggactgg 1080 

cctacgccac cgaggcacgc tttcttgtct gcgacacgcc gaacgccgaa agccggcgcg 1140 

gcatggcggc atctgcaagc atggcgcgat gcgggcaggc tggggcggga cgcgcatgtc 1200 

gtcttcaccg ggcacatgaa cgtccatgcg cgcgcattct gcgaccgccc cggcgggcat 1260 

ttccgccgct ggaacgtgca tccgccgctg cgcgaccagc gacggatgct ggaacggctg 1320 

gccgcgcggc gctttgcccc ggccttctgc cccgaccccg agatctatct ggcgctggac 1380 

atgggcgcgc aggtcttcat gcaccaggag gtgacgccat gatccccgcc cgcagcttct 1440 

gcctgatccg ccacggcgaa acgaccgcca atgcaggggc gatcatcgcg ggcgcaaccg 1500 

atgtgcccct gacgccaagg ggccgcgatc aggcccgcgc cctggcaggg cgcgaatggc 1560 

catcgggcat cgcgctgttc gccagcccga tgtcgcgtgc ccgcgatacc gcgctgctgg 1620 

cctttccggg gcgcgaccac cagcccgaac ccgatctgcg cgaacgcgac tggggcatct 1680 

tcgagggacg ccccgtcgcc gatctgcccc cgcgcgaaat cacgccgcag gggggcgagg 1740 

gctgggacga cgtgatggcc cgcgtggacc gcgcgatccg gcggatctgc gcgacctcgg 1800 

gcgatgcgct gccggtgctg gtctgccatt cgggcgtgat ccgtgccgcg cgcgtgctgt 1860 

ggaccaccgg cgatgcgggc gatcgtccgc ccaacgccac gccgatcctg ttcagcccgg 1920 

acggcgaccg attaaaggaa ggaacgatat gaccgccacc accccctgcg tcgtcttcga 1980 

acgtggacgg cacgcttgcc gaattcgacg ccgaccgcct gggccatctt gtccacggca 2040 

cgaccaagca ctgggacgcc ttccaccacg cgatggccga cgccccgccc atccccgagg 2100 

tcgcccgcct gatgcgcaag ctgaaggagg ggggcgagac ggtcgtcatc tgctcggggc 2160 

ggccccgcgg ctggcaggat cagacgatcg catggctgcg caagcacgac ctgcccttcg 2220 

acgggatcta tctgcgcccc gaggatcagg acggcgccag cgaccccgag gtcaagcgcc 2280 

gcgccctagc cgagatgcgc gccgacgggc tggcgccctg gctggtcgtg gacgaccggc 2340 

ggtccgtcgt ggatgcctgg cgggccgagg ggctggtctg cctgcaatgc gcgccggggg 2400 

acttctaggg ccgcgcgacg ggggcgcgga caggctgggc gggaaaccgc cccgccacca 2460

tgtcctgcac gcgtcgaacc gcccgtccga cgccggtttc cgcacggaaa cgcgcggcaa 2520

gttgacataa cttgcacgcg acgtctcgat tctgcccgcg aagaatgcga tgcatccaga 2580

tgatgcagaa cgaagaagcg gaagcgcccg tgaaagacca gatgatttcc cataccccgg 2640

tgcccacgca atgggtcggc ccgatcctgt tccgcggccc cgtcgtcgag ggcccgatca 2700

gcgcgccgct ggccacctac gagacgccgc tctggccctc gaccgcgcgg ggggcagggg 2760

tttcccggca ttcgggcggg atccaggtct cgctggtcga cgaacgcatg agccgctcga 2820

tcgcgctgcg ggcgcatgac ggggcggcgg cgaccgccgc ctggcagtcg atcaaggccc 2880

gccaggaaga ggtcgcggcc gtggtcgcca ccaccagccg cttcgcccgc cttgtcgagc 2940

tgaatcgcca gatcgtgggc aacctgcttt acatccgcat cgaatgcgtg acgggcgacg 3000

cctcgggtca caacatggtc accaaggccg ccgaggccgt gcagggctgg atcctgtcgg 3060

aatacccgat gctggcctat tccacgatct cggggaacct gtgcaccgac aagaaggcgt 3120

cggcggtcaa cggcatcctg ggccgcggca aatacgccgt cgccgaggtc gagatcccgc 3180

gcaagatcct gacccgcgtg ctgcgcacca gcgccgagaa gatggtccgc ctgaactacg 3240

agaagaacta tgtcgggggt acgctggcgg ggtcgctgcg cagtgcgaac gcgcatttcg 3300

ccaacatgct gctgggcttc tacctggcga cggggcagga cgcggccaac atcatcgagg 3360

ccagccaggg cttcgtccat tgcgaggccc gcggcgagga tctgtatttc tcgtgcacgc 3420

tgcccaacct catcatgggc tcggtcggtg ccggcaaggg catcccctcg atcgaggaga 3480

acctgtcgcg gatgggctgc cgccagccgg gcgaacccgg cgacaacgcg cgccgtcttg 3540

cggcgatctg cgcgggcgtc gtgctgtgtg gtgaattgtc gctgcttgcg gcccagacca 3600

accccggaga gttggtccgc acccacatgg agatggagcg atgaccgaca gcaaggatca 3660

ccatgtcgcg gggcgcaagc tggaccatct gcgtgcattg gacgacgatg cggatatcga 3720

ccggggcgac agcggcctcg accgcaccgc gctgacccat cgcgccctgc ccgaggcgga 3780

tttcgacgcc atcgacacgg cgaccagctt cctgggccgt gaactgtcct tcccgctgct 3840

gatctcgtcc atgaccggcg gcaccggcga ggagatcgag cgcatcaacc gcaacctggc 3900

cgctggtgcc gaggaggccc gcgtcgccat ggcggtgggc tcgcagcgcg tgatgttcac 3960

cgacccctcg gcgcgggcca gcttcgacct gcgcgcccat gcgcccaccg tgccgctgct 4020

ggccaatatc ggcgcggtgc agctgaacat ggggctgggg ctgaaggaat gcctggccgc 4080

gatcgaggtg ctgcaggcgg acggcctgta tctgcacctg aaccccctgc aagaggccgt 4140

ccagcccgag ggggatcgcg actttgccga tctgggcagc aagatcgcgg ccatcgcccg 4200

cgacgttccc gtgcccgtcc tgctgaagga ggtgggctgc ggcctgtcgg cggccgatat 4260

cgccatcggg ctgcgcgccg ggatccggca tttcgacgtg gccggtcgcg gcggcacatc 4320

ctggagccgg atcgagtatc gccgccgcca gcgggccgat gacgacctgg gcctggtctt 4380

ccaggactgg ggcctgcaga ccgtggacgc cctgcgcgag gcgcggcccg cgcttgcggc 4440

ccatgatgga accagcgtgc tgatcgccag cggcggcatc cgcaacggtg tcgacatggc 4500

gaaatgcgtc atcctggggg ccgacatgtg cggggtcgcc gcgcccctgc tgaaagcggc 4560

ccaaaactcg cgcgaggcgg ttgtatccgc catccggaaa ctgcatctgg agttccggac 4620

agccatgttc ctcctgggtt gcggcacgct tgccgacctg aaggacaatt cctcgcttat 4680

ccgtcaatga aagtgcctaa gatgaccgtg acaggaatcg aagcgatcag cttctacacc 4740

ccccagaact acgtgggact ggatatcctt gccgcgcatc acgggatcga ccccgagaag 4800

ttctcgaagg ggatcgggca ggagaaaatc gcactgcccg gccatgacga ggatatcgtg 4860

accatggccg ccgaggccgc gctgccgatc atcgaacgcg cgggcacgca gggcatcgac 4920

acggttctgt tcgccaccga gagcgggatc gaccagtcga aggccgccgc catctatctg 4980

cgccgcctgc tggacctgtc gcccaactgc cgttgcgtcg agctgaagca ggcctgctat 5040

tccgcgacgg cggcgctgca gatggcctgc gcgcatgtcg cccgcaagcc cgaccgcaag 5100

gtgctggtga tcgcgtccga tgtcgcgcgc tatgaccgcg aaagctcggg cgaggcgacg 5160

cagggtgcgg gcgccgtcgc catccttgtc agcgccgatc ccaaggtggc cgagatcggc 5220

accgtctcgg ggctgttcac cgaggatatc atggatttct ggcggccgaa ccaccgccgc 5280

acgcccctgt tcgacggcaa ggcatcgacg ctgcgctatc tgaacgcgct ggtcgaggcg 5340

tggaacgact atcgcgcgaa tggcggccac gagttcgccg atttcgcgca tttctgctat 5400

cacgtgccgt tctcgcggat gggcgagaag gcgaacagcc acctggccaa ggcgaacaag 5460

acgccggtgg acatggggca ggtgcagacg ggcctgatct acaaccggca ggtcgggaac 5520

tgctataccg ggtcgatcta cctggcattc gcctcgctgc tggagaacgc tcaggaggac 5580

ctgaccggcg cgctggtcgg tctgttcagc tatggctcgg gtgcgacggg cgaattcttc 5640

gatgcgcgga tcgcgcccgg ttaccgcgac cacctgttcg cggaacgcca tcgcgaattg 5700

ctgcaggatc gcacgcccgt cacatatgac gaatacgttg ccctgtggga cgagatcgac 5760

ctgacgcagg gcgcgcccga caaggcgcgc ggtcgtttca ggctggcagg tatcgaggac 5820
gagaagcgca tct atg tcg acc ggc agg cct gaa gca ggc gcc cat gcc  5869
               Met Ser Thr Gly Arg Pro Glu Ala Gly Ala His Ala
               1               5                   10

ccg ggc aag ctg atc ctg tcc ggg gaa cat tcc gtg ctc tat ggt gcg  5917
Pro Gly Lys Leu Ile Leu Ser Gly Glu His Ser Val Leu Tyr Gly Ala
        15                  20                  25

ccc gcg ctt gcc atg gcc atc gcc cgc tat acc gag gtg tgg ttc acg  5965
Pro Ala Leu Ala Met Ala Ile Ala Arg Tyr Thr Glu Val Trp Phe Thr
    30                  35                  40

ccg ctt ggc att ggc gag ggg ata cgc acg aca ttc gcc aat ctc tcg  6013
Pro Leu Gly Ile Gly Glu Gly Ile Arg Thr Thr Phe Ala Asn Leu Ser
45                  50                  55                  60

ggc ggg gcg acc tat tcg ctg aag ctg ctg tcg ggg ttc aag tcg cgg  6061
Gly Gly Ala Thr Tyr Ser Leu Lys Leu Leu Ser Gly Phe Lys Ser Arg
                65                  70                  75

ctg gac cgc cgg ttc gag cag ttc ctg aac ggc gac cta aag gtg cac  6109
Leu Asp Arg Arg Phe Glu Gln Phe Leu Asn Gly Asp Leu Lys Val His
            80                  85                  90

aag gtc ctg acc cat ccc gac gat ctg gcg gtc tat gcg ctg gcg tcg  6157
Lys Val Leu Thr His Pro Asp Asp Leu Ala Val Tyr Ala Leu Ala Ser
        95                  100                 105

ctt ctg cac gac aag ccg ccg ggg acc gcc gcg atg ccg ggc atc ggc  6205
Leu Leu His Asp Lys Pro Pro Gly Thr Ala Ala Met Pro Gly Ile Gly
    110                 115                 120

gcg atg cac cac ctg ccg cga ccg ggt gag ctg ggc agc cgg acg gag  6253
Ala Met His His Leu Pro Arg Pro Gly Glu Leu Gly Ser Arg Thr Glu
125                 130                 135                 140

ctg ccc atc ggc gcg ggc atg ggg tcg tct gcg gcc atc gtc gcg gcc  6301
Leu Pro Ile Gly Ala Gly Met Gly Ser Ser Ala Ala Ile Val Ala Ala
                145                 150                 155

acc acg gtc ctg ttc gag acg ctg ctg gac cgg ccc aag acg ccc gaa  6349
Thr Thr Val Leu Phe Glu Thr Leu Leu Asp Arg Pro Lys Thr Pro Glu
            160                 165                 170

cag cgc ttc gac cgc gtc cgc ttc tgc gag cgg ttg aag cac ggc aag  6397
Gln Arg Phe Asp Arg Val Arg Phe Cys Glu Arg Leu Lys His Gly Lys
        175                 180                 185

gcc ggt ccc atc gac gcg gcc agc gtc gtg cgc ggc ggg ctt gtc cgc  6445
Ala Gly Pro Ile Asp Ala Ala Ser Val Val Arg Gly Gly Leu Val Arg
    190                 195                 200

gtg ggc ggg aac ggg ccg ggt tcg atc agc agc ttc gat ttg ccc gag  6493
Val Gly Gly Asn Gly Pro Gly Ser Ile Ser Ser Phe Asp Leu Pro Glu
205                 210                 215                 220

gat cac gac ctt gtc gcg gga cgc ggc tgg tac tgg gta ctg cac ggg  6541
Asp His Asp Leu Val Ala Gly Arg Gly Trp Tyr Trp Val Leu His Gly
                225                 230                 235

cgc ccc gtc agc ggg acc ggc gaa tgc gtc agc gcg gtc gcg gcg gcg  6589
Arg Pro Val Ser Gly Thr Gly Glu Cys Val Ser Ala Val Ala Ala Ala
            240                 245                 250

cat ggt cgc gat gcg gcg ctg tgg gac gcc ttc gca gtc tgc acc cgc  6637
His Gly Arg Asp Ala Ala Leu Trp Asp Ala Phe Ala Val Cys Thr Arg
        255                 260                 265

gcg ttg gag gcc gcg ctg ctg tct ggg ggc agc ccc gac gcc gcc atc  6685
Ala Leu Glu Ala Ala Leu Leu Ser Gly Gly Ser Pro Asp Ala Ala Ile
    270                 275                 280

acc gag aac cag cgc ctg ctg gaa cgc atc ggc gtc gtg ccg gca gcg  6733
Thr Glu Asn Gln Arg Leu Leu Glu Arg Ile Gly Val Val Pro Ala Ala
285                 290                 295                 300

acg cag gcc ctc gtg gcc cag atc gag gag gcg ggt ggc gcg gcc aag  6781
Thr Gln Ala Leu Val Ala Gln Ile Glu Glu Ala Gly Gly Ala Ala Lys
                305                 310                 315

atc tgc ggc gca ggt tcc gtg cgg ggc gat cac ggc ggg gcg gtc ctc  6829
Ile Cys Gly Ala Gly Ser Val Arg Gly Asp His Gly Gly Ala Val Leu
            320                 325                 330

gtg cgg att gac gac gcg cag gcg atg gct tcg gtc atg gcg cgc cat  6877
Val Arg Ile Asp Asp Ala Gln Ala Met Ala Ser Val Met Ala Arg His
        335                 340                 345

ccc gac ctc gac tgg gcg ccc ctg cgc atg tcg cgc acg ggg gcg gca  6925
Pro Asp Leu Asp Trp Ala Pro Leu Arg Met Ser Arg Thr Gly Ala Ala
    350                 355                 360

ccc ggc ccc gcg ccg cgt gcg caa ccg ctg ccg ggg cag ggc tga  6970
Pro Gly Pro Ala Pro Arg Ala Gln Pro Leu Pro Gly Gln Gly
365                 370                 375

tggatcaggt catccgcgcc agcgcgccgg gttcggtcat gatcacgggc gaacatgccg 7030

tggtctatgg acaccgcgcc atcgtcgccg ggatcgagca gcgcgcccat gtgacgatcg 7090

tcccgcgtgc cgaccgcatg tttcgcatca cctcgcagat cggggcgccg cagcaggggt 7150

cgctggacga tctgcctgcg ggcgggacct atcgcttcgt gctggccgcc atcgcgcgac 7210

acgcgccgga cctgccttgc gggttcgaca tggacatcac ctcggggatc gatccgaggc 7270

tcgggcttgg atcctcggcg gcggtgacgg tcgcctgcct cggcgcgctg tcgcggctgg 7330

cggggcgggg gaccgagggg ctgcatgacg acgcgctgcg catcgtccgc gccatccagg 7390

gcaggggcag cggggccgat ctggcggcca gcctgcatgg cggcttcgtc gcctatcgcg 7450

cgcccgatgg cggtgccgcg cagatcgagg cgcttccggt gccgccgggg ccgttcggcc 7510

tgcgctatgc gggctacaag accccgacag ccgaggtgct gcgccttgtg gccgatcgga 7570

tggcgggcaa cgaggccgct ttcgacgcgc tctactcccg gatgggcgca agcgcagatg 7630

ccgcgatccg cgcggcgcaa gggctggact gggctgcatt ccacgacgcg ctgaacgaat 7690

accagcgcct gatggagcag ctgggcgtgt ccgacgacac gctggacgcg atcatccgcg 7750

aggcgcgcga cgcgggcgcc gcagtcgcca agatctccgg ctcggggctg ggggattgcg 7810

tgctggcact gcgcgaccag cccaagggtt tcgtgcccgc aagcattgcc gagaagggac 7870

ttgttttcga tgactgatgc cgtccgcgac atgatcgccc gtgccatggc gggcgcgacc 7930

gacatccgag cagccgaggc ttatgcgccc agcaacatcg cgctgtcgaa atactggggc 7990

aagcgcgacg ccgcgcggaa ccttccgctg aacagctccg tctcgatctc gttggcgaac 8050

tggggctctc atacgcgggt cgaggggtcc ggcacgggcc acgacgaggt gcatcacaac 8110

ggcacgctgc tggatccggg cgacgccttc gcgcgccgcg cgttggcatt cgctgacctg 8170

ttccgggggg ggaggcacct gccgctgcgg atcacgacgc agaactcgat cccgacggcg 8230

gcggggcttg cctcgtcggc ctcggggttc gcggcgctga cccgtgcgct ggcgggggcg 8290

ttcgggctgg atctggacga cacggatctg agccgcatcg cccggatcgg cagtggcagc 8350

gccgcccgct cgatctggca cggcttcgtc cgctggaacc ggggcgaggc cgaggatggg 8410

catgacagcc acggcgtccc gctggacctg cgctggcccg gcttccgcat cgcgatcgtg 8470

gccgtggaca aggggcccaa gcctttcagt tcgcgcgacg gcatgaacca cacggtcgag 8530

accagcccgc tgttcccgcc ctggcctgcg caggcggaag cggattgccg cgtcatcgag 8590

gatgcgatcg ccgcccgcga catggccgcc ctgggtccgc gggtcgaggc gaacgccctt 8650

gcgatgcacg ccacgatgat ggccgcgcgc ccgccgctct gctacctgac gggcggcagc 8710

tggcaggtgc tggaacgcct gtggcaggcc cgcgcggacg ggcttgcggc ctttgcgacg 8770

atggatgccg gcccgaacgt caagctgatc ttcgaggaaa gcagcgccgc cgacgtgctg 8830

tacctgttcc ccgacgccag cctgatcgcg ccgttcgagg ggcgttgaac gcgtaagacg 8890

accactgggt aaggttctgc cgcgcgtggt ctcgactgcc tgcaaagagg tgcttgagtt 8950

gctgcgtgac tgcggcggcc gacttcgtgg gacttgcccg ccacgctgac gcgctggaaa 9010

cgcgcccgcg gattacgacc gcgtcattgc cctgaaccaa tttcccgtcg gtcgac 9066

<210> 49

<211> 378

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<212> PRT 

<213> Paracoccus sp. R114 



<400> 49 

Met Ser Thr Gly Arg Pro Glu Ala Gly Ala His Ala Pro Gly Lys Leu
1               5                   10                  15

Ile Leu Ser Gly Glu His Ser Val Leu Tyr Gly Ala Pro Ala Leu Ala
            20                  25                  30

Met Ala Ile Ala Arg Tyr Thr Glu Val Trp Phe Thr Pro Leu Gly Ile
        35                  40                  45

Gly Glu Gly Ile Arg Thr Thr Phe Ala Asn Leu Ser Gly Gly Ala Thr
    50                  55                  60

Tyr Ser Leu Lys Leu Leu Ser Gly Phe Lys Ser Arg Leu Asp Arg Arg
65                  70                  75                  80

Phe Glu Gln Phe Leu Asn Gly Asp Leu Lys Val His Lys Val Leu Thr
                85                  90                  95

His Pro Asp Asp Leu Ala Val Tyr Ala Leu Ala Ser Leu Leu His Asp
            100                 105                 110

Lys Pro Pro Gly Thr Ala Ala Met Pro Gly Ile Gly Ala Met His His
        115                 120                 125

Leu Pro Arg Pro Gly Glu Leu Gly Ser Arg Thr Glu Leu Pro Ile Gly
    130                 135                 140

Ala Gly Met Gly Ser Ser Ala Ala Ile Val Ala Ala Thr Thr Val Leu
145                 150                 155                 160

Phe Glu Thr Leu Leu Asp Arg Pro Lys Thr Pro Glu Gln Arg Phe Asp
                165                 170                 175

Arg Val Arg Phe Cys Glu Arg Leu Lys His Gly Lys Ala Gly Pro Ile
            180                 185                 190

Asp Ala Ala Ser Val Val Arg Gly Gly Leu Val Arg Val Gly Gly Asn
        195                 200                 205

Gly Pro Gly Ser Ile Ser Ser Phe Asp Leu Pro Glu Asp His Asp Leu
    210                 215                 220

Val Ala Gly Arg Gly Trp Tyr Trp Val Leu His Gly Arg Pro Val Ser
225                 230                 235                 240

Gly Thr Gly Glu Cys Val Ser Ala Val Ala Ala Ala His Gly Arg Asp
                245                 250                 255

Ala Ala Leu Trp Asp Ala Phe Ala Val Cys Thr Arg Ala Leu Glu Ala
            260                 265                 270

Ala Leu Leu Ser Gly Gly Ser Pro Asp Ala Ala Ile Thr Glu Asn Gln
        275                 280                 285

Arg Leu Leu Glu Arg Ile Gly Val Val Pro Ala Ala Thr Gln Ala Leu
    290                 295                 300

Val Ala Gln Ile Glu Glu Ala Gly Gly Ala Ala Lys Ile Cys Gly Ala
305                 310                 315                 320

Gly Ser Val Arg Gly Asp His Gly Gly Ala Val Leu Val Arg Ile Asp
                325                 330                 335

Asp Ala Gln Ala Met Ala Ser Val Met Ala Arg His Pro Asp Leu Asp
            340                 345                 350

Trp Ala Pro Leu Arg Met Ser Arg Thr Gly Ala Ala Pro Gly Pro Ala
        355                 360                 365

Pro Arg Ala Gln Pro Leu Pro Gly Gln Gly
    370                 375



<210> 50
<211> 9066
<212> DNA
<213> Paracoccus sp. R114

<220>
<221> CDS
<222> (6970)..(7887)
<223> pmk

<400> 50
ggatccggca 


gctcgacacg 


ccgcagaacc 


tgtacgaacg 


ccccgccagc 


cgc ttcy ccg 




cggaattcgt 


cgggcgcggg 


acggtggtgc 


ccgtgcaggc 


ccatgacggc 


> » ^% **** 

^r c cfQr£f c c y c y 


120


cccgcatcct 


gggggccgag 


gtggcggtga 


acgccgcccc 


gcaatcgcgc 


♦— y-r +- r"*n o +~ 

ll egtcgate 


180


acgtctgcct 


gcgccccgag 


aaccttgcca 


tctccgagac 


gggcgacctg 


cgcgccaagg 


240


tcgcgcgcgt 


cacctatctt 


ggcgggaaat 


acctgctgga 


aaccgcgccg 


gatcgcggca


300


cccggctggt 


gaccgagacc 


cgcgcccgct 


tcgatacggg


cgcycagctL 


yyCLuyaULa 


360


tcaacgcccc 


ctgggccttt 


gccgaggatt 


gaatggacag 


cgcgaagacc 


t - f* f~ /~" rr/rrrr'a 

CLLi-cyyyta 


420


tgggcgtgaa 


gggccctgcc 


tgcaccaggc 


cgy« tguegg 


c ggy dt g L y^ 




480 



attgcgggac 


cggcccggac 


gagggcgcgg 


agt CCyaCCC 


eyectyyt- uy 


rrnrrrra r*n(~' net 
yuyyat-yoyy 


540 



atgcggtgct 


gatcacccat 


gaccacgugg 


--j ^ —i 4~ =1 f fr (~T 

aCCataCCyg 


rf/T^rrrTrprrl" 


i^»2a r*rrr*€*ffTt" per 


600 



cggcggggct 


gccgacccac 


gcgacgcggc 


aria PrtrfrTTnrt 


y L- i— y w w y w w 


cr c crcr era ctc crcr 
y *-y yyyy 


660 



atctgcgcct 


gccgcccgaa 




<-y<- yy auL. y 




ctaacaacca 


720 


gtcgcaacgg 


gcatgccgcg 


ggcggcgtct


ggatgcattt 


cgacatgggc 


gaggggctgt 


780 


tctattccgg 


cgactggtcc 


gaggaatccg 


actggttcgc 


cttcgatccg 


cccccgcctg 


840 


cggggacggc 


gattctcgac 


tgctcctatg 


gcggtttcga 


cgtggcgcaa 


tcggattgca


900 


tcgcggacct 


ggacgacctg 


ctcgaggtgc 


tgccggggca


ggtactgctg 


ccggtgccgc 


960 


catccggccg 


cgcggccgag 


ctggccctgc 


ggctgatccg 


ccgccacgga 


ccgggcagcg 


1020 


tgatggtcga 


cgacgcctgc 


ctgccggcca 


tcgcgcaact 


gcccgaggcg 


cgcggactgg 


1080 


cctacgccac 


cgaggcacgc 


tttcttgtct 


gcgacacgcc 


gaacgccgaa 


agccggcgcg 


1140 


gcatggcggc 


atctgcaagc 


atggcgcgat 


gcgggcaggc


tggggcggga 


cgcgcatgtc 


1200 


gtcttcaccg 


ggcacatgaa 


cgtccatgcg 


cgcgcattct 


gcgaccgccc 


cggcgggcat


1260 


ttccgccgct 


ggaacgtgca 


tccgccgctg 


cgcgaccagc 


gacggatgct


ggaacggctg 


1320 


gccgcgcggc 


gctttgcccc 


ggccttctgc 


cccgaccccg 


agatctatct 


ggcgctggac 


1380 
atgggcgcgc aggtcttcat gcaccaggag gtgacgccat gatccccgcc cgcagcttct 1440
gcctgatccg ccacggcgaa acgaccgcca atgcaggggc gatcatcgcg ggcgcaaccg 1500 
atgtgcccct gacgccaagg ggccgcgatc aggcccgcgc cctggcaggg cgcgaatggc 1560 
catcgggcat cgcgctgttc gccagcccga tgtcgcgtgc ccgcgatacc gcgctgctgg 1620 
cctttccggg gcgcgaccac cagcccgaac ccgatctgcg cgaacgcgac tggggcatct 1680 
tcgagggacg ccccgtcgcc gatctgcccc cgcgcgaaat cacgccgcag gggggcgagg 1740 
gctgggacga cgtgatggcc cgcgtggacc gcgcgatccg gcggatctgc gcgacctcgg 1800 
gcgatgcgct gccggtgctg gtctgccatt cgggcgtgat ccgtgccgcg cgcgtgctgt 1860 


ggaccaccgg cgatgcgggc gatcgtccgc ccaacgccac gccgatcctg ttcagcccgg 1920 
acggcgaccg attaaaggaa ggaacgatat gaccgccacc accccctgcg tcgtcttcga 1980 
acgtggacgg cacgcttgcc gaattcgacg ccgaccgcct gggccatctt gtccacggca 2040 

cgaccaagca ctgggacgcc ttccaccacg cgatggccga cgccccgccc atccccgagg 2100 

tcgcccgcct gatgcgcaag ctgaaggagg ggggcgagac ggtcgtcatc tgctcggggc 2160 

ggccccgcgg ctggcaggat cagacgatcg catggctgcg caagcacgac ctgcccttcg 2220

acgggatcta tctgcgcccc gaggatcagg acggcgccag cgaccccgag gtcaagcgcc 2280 

gcgccctagc cgagatgcgc gccgacgggc tggcgccctg gctggtcgtg gacgaccggc 2340 

ggtccgtcgt ggatgcctgg cgggccgagg ggctggtctg cctgcaatgc gcgccggggg 2400 

acttctaggg ccgcgcgacg ggggcgcgga caggctgggc gggaaaccgc cccgccacca 2460 

tgtcctgcac gcgtcgaacc gcccgtccga cgccggtttc cgcacggaaa cgcgcggcaa 2520

gttgacataa cttgcacgcg acgtctcgat tctgcccgcg aagaatgcga tgcatccaga 2580 

tgatgcagaa cgaagaagcg gaagcgcccg tgaaagacca gatgatttcc cataccccgg 2640 

tgcccacgca atgggtcggc ccgatcctgt tccgcggccc cgtcgtcgag ggcccgatca 2700 

gcgcgccgct ggccacctac gagacgccgc tctggccctc gaccgcgcgg ggggcagggg 2760 

tttcccggca ttcgggcggg atccaggtct cgctggtcga cgaacgcatg agccgctcga 2820 

tcgcgctgcg ggcgcatgac ggggcggcgg cgaccgccgc ctggcagtcg atcaaggccc 2880 



gccaggaaga ggtcgcggcc gtggtcgcca ccaccagccg cttcgcccgc cttgtcgagc 2940 



tgaatcgcca gatcgtgggc aacctgcttt acatccgcat cgaatgcgtg acgggcgacg 3000



cctcgggtca caacatggtc accaaggccg ccgaggccgt gcagggctgg atcctgtcgg 3060 
aatacccaat


gctggcctat

^» — 1 ^1 


tccacgatct 


cggggaacct


gtgcaccgac 


aagaaggcgt 


3120 


raacaatcaa 


ccracatcc tg 


ggccgcggca 


aatacgccgt 


cgccgaggtc 


gagatcccgc 


3180 


acaaoatcct 


gacccgcgtg 


ctgcgcacca 


gcgccgagaa 


gatggtccgc


ctgaactacg 


3240 


^craaaaacta 


tategggggt 


acgctggcgg


ggtcgctgcg 


cagtgcgaac


gcgcatttcg


3300 


L>L>OClv« CA V_ y v — l_ 


actaaacttc 


tacctggcga


cggggcagga


cgcggccaac 


atcatcgagg 


3360 




ct~ tratccat 


tacaaacjccc 


gcggcgagga 


tctgtatttc 


tcgtgcacgc 


3420 




r* a. t* crcr ere 


tcaatccrcrtcr 

^» ^3 ^3 v»> ^3 w 


ccggcaaggg 


catcccctcg 


atcgaggaga 


3480 


a.c.c cy l. t»y y 




cere caoc cqq 


gcgaacccgg 


cgacaacgcg 


cgccgtcttg 


3540 


c n Pi t* f~ <T 


r* ere cr crcrc ci t c 


gtgctgtgtg 


gtgaattgtc 


gctgcttgcg


gcccagacca 


3600 


dL.<-UL.y yoy a 


crfc fcerert ccac 
y >— ^ y y ^ w wy v» 


acccacatgg 


agatggagcg 


atgaccgaca 


gcaaggatca 


3660 


t~ rrt~ cried 
LLaty L- s^y »w y 


crcrcrccrcaacrc 
yyy^*y v, * uu y v 


taaaccatct 


gcgtgcattg 


gacgacgatg 


cggatatcga


3720 


ccggyy^yau 


Q y^yy* 


accacatccrc 


gc tgacccat 


cgcgccctgc 


ccgaggtgga 


3780 




ex. L.uy uuouy y 


ccraccaoc 1 1 


cctgggccgt 


gaactgtcct 


tcccgctgct 


3840 


y duo ucy 


o t- cracccrcrccf 

d «— y c* . v_» y y v- » y 


gcaccggcga 


ggagatcgag 


cgcatcaacc 


gcaacctggc 


3900 




eracrcraeTCrcCC 

c*y y oy y w ^ — v»» 


accftccrccat 


ggcggtgggc 


tcgcagcgcg 


tgatgttcac 


3960 


LyaLucc uuy 


y^y^yyy v- 


acttcoacct 


gcgcgcccat 


gcgcccaccg 


tgccgctgct 


4020 


ciac f 2* st f* i" f" 
y y t» au i_ q i_ Vw 


crcr c* cr c cr cr t a c 

yy^y^yy v»yw 


age tgaacat 


ggggctgggg 


ctgaaggaat 


gcctggccgc 


4080 


yaiuy ayy v- y 


ctcjcaorcrcQCf 


acggcctgta 


tctgcacctg 


aaccccctgc 


aagaggccgt 


4140 


n*ay v.. y c*y 


aaaaatccrccr 


actttgccga


tctgggcagc 


aagatcgcgg


ccatcgcccg 


4200 


c ci p\c* ci t~ tccc 


crtcrcccQtcc 


tgctgaagga 


ggtgggctgc 


ggcctgtcgg 


cggccgatat


4260 


rcrrrai" ccrcrcr 

*— y WW CA L— ^3 


ct.cfcgcaccg 


ggatccggca


tttcgacgtg 


gccggtcgcg 


gcggcacatc 


4320 




atcgagtatc 


gccgccgcca 


gcgggccgat


gacgacctgg 


gcctggtctt


4380 




aacctcfcacra 


ccgtggacgc 


cctgcgcgag 

TILT* ' 


gcgcggcccg 


cgcttgcggc 


4440 


c c* a t era t crcr 3. 


accagcgtgc 


tgatcgccag 


cggcggcatc 


cgcaacggtg 


tcgacatggc 


4500 


gaaatgcgtc 


atcctggggg 


ccgacatgtg 


cggggtcgcc 


gcgcccccgc 


egaaagegge 


4560


ccaaaactcg 


cgcgaggcgg


ttgtatccgc 


catccggaaa


ctgcatctgg 


agttccggac 


4620 


agccatgttc 


ctcctgggtt 


gcggcacgct 


tgccgacctg 


aaggacaatt 


cctcgcttat 


4680 


ccgtcaatga 


aagtgcctaa 


gatgaccgtg


acaggaatcg 


aagcgatcag 


cttctacacc 


4740 


ccccagaact 


acgtgggact 


ggatatcctt 


gccgcgcatc 


acgggatcga


ccccgagaag 


4800 
ttctcgaagg ggatcgggca ggagaaaatc gcactgcccg gccatgacga ggatatcgtg 4860

accatggccg ccgaggccgc gctgccgatc atcgaacgcg cgggcacgca gggcatcgac 4920 

acggttctgt tcgccaccga gagcgggatc gaccagtcga aggccgccgc catctatctg 4980 

cgccgcctgc tggacctgtc gcccaactgc cgttgcgtcg agctgaagca ggcctgctat 5040 

tccgcgacgg cggcgctgca gatggcctgc gcgcatgtcg cccgcaagcc cgaccgcaag 5100 

gtgctggtga tcgcgtccga tgtcgcgcgc tatgaccgcg aaagctcggg cgaggcgacg 5160 

cagggtgcgg gcgccgtcgc catccttgtc agcgccgatc ccaaggtggc cgagatcggc 5220 

accgtctcgg ggctgttcac cgaggatatc atggatttct ggcggccgaa ccaccgccgc 5280 

acgcccctgt tcgacggcaa ggcatcgacg ctgcgctatc tgaacgcgct ggtcgaggcg 5340

tggaacgact atcgcgcgaa tggcggccac gagttcgccg atttcgcgca tttctgctat 5400 

cacgtgccgt tctcgcggat gggcgagaag gcgaacagcc acctggccaa ggcgaacaag 5460 

acgccggtgg acatggggca ggtgcagacg ggcctgatct acaaccggca ggtcgggaac 5520 

tgctataccg ggtcgatcta cctggcattc gcctcgctgc tggagaacgc tcaggaggac 5580 

ctgaccggcg cgctggtcgg tctgttcagc tatggctcgg gtgcgacggg cgaattcttc 5640 

gatgcgcgga tcgcgcccgg ttaccgcgac cacctgttcg cggaacgcca tcgcgaattg 5700 

ctgcaggatc gcacgcccgt cacatatgac gaatacgttg ccctgtggga cgagatcgac 5760 

ctgacgcagg gcgcgcccga caaggcgcgc ggtcgtttca ggctggcagg tatcgaggac 5820 

gagaagcgca tctatgtcga ccggcaggcc tgaagcaggc gcccatgccc cgggcaagct 5880 

gatcctgtcc ggggaacatt ccgtgctcta tggtgcgccc gcgcttgcca tggccatcgc 5940

ccgctatacc gaggtgtggt tcacgccgct tggcattggc gaggggatac gcacgacatt 6000 

cgccaatctc tcgggcgggg cgacctattc gctgaagctg ctgtcggggt tcaagtcgcg 6060 

gctggaccgc cggttcgagc agttcctgaa cggcgaccta aaggtgcaca aggtcctgac 6120

ccatcccgac gatctggcgg tctatgcgct ggcgtcgctt ctgcacgaca agccgccggg 6180 

gaccgccgcg atgccgggca tcggcgcgat gcaccacctg ccgcgaccgg gtgagctggg 6240

cagccggacg gagctgccca tcggcgcggg catggggtcg tctgcggcca tcgtcgcggc 6300 

caccacggtc ctgttcgaga cgctgctgga ccggcccaag acgcccgaac agcgcttcga 6360

ccgcgtccgc ttctgcgagc ggttgaagca cggcaaggcc ggtcccatcg acgcggccag 6420

cgtcgtgcgc ggcgggcttg tccgcgtggg cgggaacggg ccgggttcga tcagcagctt 6480 
cgatttgccc gaggatcacg accttgtcgc gggacgcggc tggtactggg tactgcacgg 6540

gcgccccgtc agcgggaccg gcgaatgcgt cagcgcggtc gcggcggcgc atggtcgcga 6600

tgcggcgctg tgggacgcct tcgcagtctg cacccgcgcg ttggaggccg cgctgctgtc 6660

tgggggcagc cccgacgccg ccatcaccga gaaccagcgc ctgctggaac gcatcggcgt 6720

cgtgccggca gcgacgcagg ccctcgtggc ccagatcgag gaggcgggtg gcgcggccaa 6780

gatctgcggc gcaggttccg tgcggggcga tcacggcggg gcggtcctcg tgcggattga 6840

cgacgcgcag gcgatggctt cggtcatggc gcgccatccc gacctcgact gggcgcccct 6900

gcgcatgtcg cgcacggggg cggcacccgg ccccgcgccg cgtgcgcaac cgctgccggg 6960

gcagggctg atg gat cag gtc atc cgc gcc agc gcg ccg ggt tcg gtc atg 7011
Met Asp Gln Val Ile Arg Ala Ser Ala Pro Gly Ser Val Met
1 5 10

atc acg ggc gaa cat gcc gtg gtc tat gga cac cgc gcc atc gtc gcc 7059
Ile Thr Gly Glu His Ala Val Val Tyr Gly His Arg Ala Ile Val Ala
15 20 25 30

ggg atc gag cag cgc gcc cat gtg acg atc gtc ccg cgt gcc gac cgc 7107
Gly Ile Glu Gln Arg Ala His Val Thr Ile Val Pro Arg Ala Asp Arg
35 40 45

atg ttt cgc atc acc tcg cag atc ggg gcg ccg cag cag ggg tcg ctg 7155
Met Phe Arg Ile Thr Ser Gln Ile Gly Ala Pro Gln Gln Gly Ser Leu
50 55 60

gac gat ctg cct gcg ggc ggg acc tat cgc ttc gtg ctg gcc gcc atc 7203
Asp Asp Leu Pro Ala Gly Gly Thr Tyr Arg Phe Val Leu Ala Ala Ile
65 70 75

gcg cga cac gcg ccg gac ctg cct tgc ggg ttc gac atg gac atc acc 7251
Ala Arg His Ala Pro Asp Leu Pro Cys Gly Phe Asp Met Asp Ile Thr
80 85 90

tcg ggg atc gat ccg agg ctc ggg ctt gga tcc tcg gcg gcg gtg acg 7299
Ser Gly Ile Asp Pro Arg Leu Gly Leu Gly Ser Ser Ala Ala Val Thr
95 100 105 110

gtc gcc tgc ctc ggc gcg ctg tcg cgg ctg gcg ggg cgg ggg acc gag 7347
Val Ala Cys Leu Gly Ala Leu Ser Arg Leu Ala Gly Arg Gly Thr Glu
115 120 125

ggg ctg cat gac gac gcg ctg cgc atc gtc cgc gcc atc cag ggc agg 7395
Gly Leu His Asp Asp Ala Leu Arg Ile Val Arg Ala Ile Gln Gly Arg
130 135 140

ggc agc ggg gcc gat ctg gcg gcc agc ctg cat ggc ggc ttc gtc gcc 7443
Gly Ser Gly Ala Asp Leu Ala Ala Ser Leu His Gly Gly Phe Val Ala
145 150 155

tat cgc gcg ccc gat ggc ggt gcc gcg cag atc gag gcg ctt ccg gtg 7491
Tyr Arg Ala Pro Asp Gly Gly Ala Ala Gln Ile Glu Ala Leu Pro Val
160 165 170



ccg ccg ggg ccg ttc ggc ctg cgc tat gcg ggc tac aag acc ccg aca 7539
Pro Pro Gly Pro Phe Gly Leu Arg Tyr Ala Gly Tyr Lys Thr Pro Thr
175 180 185 190

gcc gag gtg ctg cgc ctt gtg gcc gat cgg atg gcg ggc aac gag gcc 7587
Ala Glu Val Leu Arg Leu Val Ala Asp Arg Met Ala Gly Asn Glu Ala
195 200 205

gct ttc gac gcg ctc tac tcc cgg atg ggc gca agc gca gat gcc gcg 7635
Ala Phe Asp Ala Leu Tyr Ser Arg Met Gly Ala Ser Ala Asp Ala Ala
210 215 220

atc cgc gcg gcg caa ggg ctg gac tgg gct gca ttc cac gac gcg ctg 7683
Ile Arg Ala Ala Gln Gly Leu Asp Trp Ala Ala Phe His Asp Ala Leu
225 230 235

aac gaa tac cag cgc ctg atg gag cag ctg ggc gtg tcc gac gac acg 7731
Asn Glu Tyr Gln Arg Leu Met Glu Gln Leu Gly Val Ser Asp Asp Thr
240 245 250

ctg gac gcg atc atc cgc gag gcg cgc gac gcg ggc gcc gca gtc gcc 7779
Leu Asp Ala Ile Ile Arg Glu Ala Arg Asp Ala Gly Ala Ala Val Ala
255 260 265 270

aag atc tcc ggc tcg ggg ctg ggg gat tgc gtg ctg gca ctg ggc gac 7827
Lys Ile Ser Gly Ser Gly Leu Gly Asp Cys Val Leu Ala Leu Gly Asp
275 280 285

cag ccc aag ggt ttc gtg ccc gca agc att gcc gag aag gga ctt gtt 7875
Gln Pro Lys Gly Phe Val Pro Ala Ser Ile Ala Glu Lys Gly Leu Val
290 295 300

ttc gat gac tga tgccgtccgc gacatgatcg cccgtgccat ggcgggcgcg 7927
Phe Asp Asp
305



accgacatcc 


gagcagccga


ggcttatgcg 


cccagcaaca 


tcgcgctgtc 


gaaatactgg 


7987 


ggcaagcgcg 


acgccgcgcg 


gaaccttccg 


ctgaacagct 


ccgtctcgat 


ctcgttggcg 


8047 


aactggggct 


ctcatacgcg 


ggtcgagggg 


tccggcacgg 


gccacgacga 


ggtgcatcac 


8107 


aacggcacgc 


tgctggatcc 


gggcgacgcc 


ttcgcgcgcc 


gcgcgttggc 


attcgctgac


8167 


ctgttccggg 


gggggaggca 


cctgccgctg 


cggatcacga


cgcagaactc 


gatcccgacg 


8227 


gcggcggggc


ttgcctcgtc 


ggcctcgggg 


ttcgcggcgc


tgacccgtgc 


gctggcgggg 


8287 


gcgttcgggc 


tggatctgga 


cgacacggat


ctgagccgca 


tcgcccggat 


cggcagtggc 


8347 


agcgccgccc 


gctcgatctg 


gcacggcttc 


gtccgctgga 


accggggcga


ggccgaggat


8407 


gggcatgaca 


gccacggcgt 


cccgctggac 


ctgcgctggc 


ccggcttccg 


catcgcgatc 


8467 






WO 02/099095 



PCT/EP02/06171 



gtggccgtgg 


acaaggggcc 


caagcctttc 


agttcgcgcg 


acggcatgaa 


ccacacggtc 


8527 


gagaccagcc 


cgctgttccc 


gccctggcct 


gcgcaggcgg 


aagcggattg 


ccgcgtcatc 


8587 


gaggatgcga 


tcgccgcccg 


cgacatggcc 


gccctgggtc 


cgcgggtcga 


ggcgaacgcc 


8647 


cttgcgatgc 


acgccacgat 


gatggccgcg 


cgcccgccgc 


tctgctacct 


gacgggcggc 


8707 


agctggcagg 


tgctggaacg 


cctgtggcag 


gcccgcgcgg 


acgggcttgc 


ggcctttgcg 


8767 


acgatggatg 


ccggcccgaa 


cgtcaagctg 


atcttcgagg 


aaagcagcgc 


cgccgacgtg 


8827 


ctgtacctgt 


tccccgacgc 


cagcctgatc 


gcgccgttcg 


aggggcgttg 


aacgcgtaag 


8887 


acgaccactg 


ggtaaggttc 


tgccgcgcgt 


ggtctcgact 


gcctgcaaag 


aggtgcttga 


8947 


gttgctgcgt 


gactgcggcg 


gccgacttcg 


tgggacttgc 


ccgccacgct 


gacgcgctgg 


9007 


aaacgcgccc 


gcggattacg 


accgcgtcat 


tgccctgaac 


caatttcccg 


tcggtcgac 


9066 



<210> 51
<211> 305
<212> PRT
<213> Paracoccus sp. R114

<400> 51

Met Asp Gln Val Ile Arg Ala Ser Ala Pro Gly Ser Val Met Ile Thr
1 5 10 15

Gly Glu His Ala Val Val Tyr Gly His Arg Ala Ile Val Ala Gly Ile
20 25 30

Glu Gln Arg Ala His Val Thr Ile Val Pro Arg Ala Asp Arg Met Phe
35 40 45

Arg Ile Thr Ser Gln Ile Gly Ala Pro Gln Gln Gly Ser Leu Asp Asp
50 55 60

Leu Pro Ala Gly Gly Thr Tyr Arg Phe Val Leu Ala Ala Ile Ala Arg
65 70 75 80

His Ala Pro Asp Leu Pro Cys Gly Phe Asp Met Asp Ile Thr Ser Gly
85 90 95

Ile Asp Pro Arg Leu Gly Leu Gly Ser Ser Ala Ala Val Thr Val Ala
100 105 110

Cys Leu Gly Ala Leu Ser Arg Leu Ala Gly Arg Gly Thr Glu Gly Leu
115 120 125

His Asp Asp Ala Leu Arg Ile Val Arg Ala Ile Gln Gly Arg Gly Ser
130 135 140

Gly Ala Asp Leu Ala Ala Ser Leu His Gly Gly Phe Val Ala Tyr Arg
145 150 155 160

Ala Pro Asp Gly Gly Ala Ala Gln Ile Glu Ala Leu Pro Val Pro Pro
165 170 175

Gly Pro Phe Gly Leu Arg Tyr Ala Gly Tyr Lys Thr Pro Thr Ala Glu
180 185 190

Val Leu Arg Leu Val Ala Asp Arg Met Ala Gly Asn Glu Ala Ala Phe
195 200 205

Asp Ala Leu Tyr Ser Arg Met Gly Ala Ser Ala Asp Ala Ala Ile Arg
210 215 220

Ala Ala Gln Gly Leu Asp Trp Ala Ala Phe His Asp Ala Leu Asn Glu
225 230 235 240

Tyr Gln Arg Leu Met Glu Gln Leu Gly Val Ser Asp Asp Thr Leu Asp
245 250 255

Ala Ile Ile Arg Glu Ala Arg Asp Ala Gly Ala Ala Val Ala Lys Ile
260 265 270

Ser Gly Ser Gly Leu Gly Asp Cys Val Leu Ala Leu Gly Asp Gln Pro
275 280 285

Lys Gly Phe Val Pro Ala Ser Ile Ala Glu Lys Gly Leu Val Phe Asp
290 295 300

Asp
305



<210> 52
<211> 9066
<212> DNA
<213> Paracoccus sp. R114

<220>
<221> CDS
<222> (7880)..(8878)
<223> mvd

<400> 52
ggatccggca


gctcgacacg 


ccgcagaacc 


tgtacgaacg 


tcccgccagc 


cgcttcgtcg 


60 


cggaattcgt


cgggcgcggg 


acggtggtgc 


ccgtgcaggc 


ccatgacggc 


gcgggccgcg 


120 


cccacatcct 


gggggccgag 


gtggcggtga 


acgccgcccc 


gcaatcgcgc 


tttgtcgatc 


180 


acgtctgcct 


gcgccccgag 


aaccttgcca 


tctccgagac 


gggcgacctg 


cgcgccaagg 


240 


tcgcgcgcgt 


cacctatctt 


ggcgggaaat 


acctgctgga


aaccgcgccg


gatcgcggca


300


cccggctggt 


gaccgagacc 


cgcgcccgct 


tcgatacggg 


cgcgcagctt 


ggcctgacca 


360 


tcaacgcccc 


ctgggccttt 


gccgaggatt 


gaatggacag 


cgtgaagatc 


ctttcgggca


420 


tgggcgtgaa 


gggccctgcc 


tgcatcaggc 


tggatgtcgg 


cgggatgcgc


ctgatcctcg 


480 


attgcgggac 


cggcccggac 


gagggcgcgg 


agttcgaccc 


cgcctggctg 


gcggacgcgg


540 


atgcggtgct 


gatcacccat 


gaccacgtgg 


accatatcgg 


cggcgcgcgt 


cacgcggtcg 


600 


cggcggggct 


gccgatccat 


gcgacgcggc 


agacggcggg 


gttgctgccc 


gcgggggcgg


660 


atctgcgcct 


gctgcccgaa 


cgcggtgtca 


cgcggatcgc 


cggggtcgat


ctgacgaccg 


720 


gtcgcaacgg 


gcatgccgcg 


ggcggcgtct 


ggatgcattt 


cgacatgggc 


gaggggctgt 


780 


tctattccgg 


cgactggtcc 


gaggaatccg 


actggttcgc 


cttcgatccg 


cccccgcctg 


840 


cggggacggc 


gattctcgac 


tgctcctatg 


gcggtttcga 


cgtggcgcaa 


tcggattgca


900 


tcgcggacct 


ggacgacctg 


ctcgaggtgc 


tgccggggca 


ggtactgctg 


ccggtgccgc 


960 


catccggccg 


cgcggccgag 


ctggccctgc 


ggctgatccg 


ccgccacgga 


ccgggcagcg 


1020 


tgatggtcga 


cgacgcctgc 


ctgccggcca 


tcgcgcaact 


gcccgaggcg 


cgcggactgg 


1080 


cctacgccac 


cgaggcacgc 


tttcttgtct 


gcgacacgcc 


gaacgccgaa 


agccggcgcg 


1140 
gcatggcggc


atctgcaagc 


atggcgcgat 


gcgggcaggc 


tggggcggga 


cgcgcatgtc 


1200 


gtcttcaccg 


ggcacatgaa 


cgtccatgcg 


cgcgcattct 


gcgaccgccc 


cggcgggcat 


1260 


ttccgccgct 


ggaacgtgca 


tccgccgctg 


cgcgaccagc 


gacggatgct 


ggaacggctg 


1320 


gccgcgcggc 


gctttgcccc 


ggccttctgc 


cccgaccccg 


agatctatct 


ggcgctggac 


1380 


atgggcgcgc 


aggtcttcat 


gcaccaggag 


gtgacgccat 


gatccccgcc 


cgcagcttct 


1440 


gcctgatccg 


ccacggcgaa 


acgaccgcca 


atgcaggggc 


gatcatcgcg 


ggcgcaaccg 


1500 


atgtgcccct 


gacgccaagg 


ggccgcgatc 


aggcccgcgc 


cctggcaggg 


cgcgaatggc 


1560 


catcgggcat 


cgcgctgttc 


gccagcccga 


tgtcgcgtgc 


ccgcgatacc 


gcgctgctgg 


1620 


cctttccggg 


gcgcgaccac 


cagcccgaac 


ccgatctgcg 


cgaacgcgac 


tggggcatct 


1680 


tcgagggacg 


ccccgtcgcc 


gatctgcccc 


cgcgcgaaat 


cacgccgcag 


gggggcgagg 


1740 


gctgggacga 


cgtgatggcc 


cgcgtggacc 


gcgcgatccg 


gcggatctgc 


gcgacctcgg 


1800 


gcgatgcgct 


gccggtgctg 


gtctgccatt 


cgggcgtgat 


ccgtgccgcg 


cgcgtgctgt 


1860 


ggaccaccgg 


cgatgcgggc 


gatcgtccgc 


ccaacgccac 


gccgatcctg 


ttcagcccgg 


1920 


acggcgaccg 


attaaaggaa 


ggaacgatat 


gaccgccacc 


accccctgcg 


tcgtcttcga 


1980 


acgtggacgg 


cacgcttgcc 


gaattcgacg 


ccgaccgcct 


gggccatctt 


gtccacggca 


2040 


cgaccaagca 


ctgggacgcc 


ttccaccacg 


cgatggccga 


cgccccgccc 


atccccgagg 


2100 


tcgcccgcct 


gatgcgcaag 


ctgaaggagg 


ggggcgagac 


ggtcgtcatc 


tgctcggggc 


2160 


ggccccgcgg 


ctggcaggat 


cagacgatcg 


catggctgcg 


caagcacgac 


ctgcccttcg 


2220 


acgggatcta 


tctgcgcccc 


gaggatcagg 


acggcgccag 


cgaccccgag 


gtcaagcgcc 


2280 


gcgccctagc 


cgagatgcgc 


gccgacgggc 


tggcgccctg 


gctggtcgtg 


gacgaccggc 


2340 


ggtccgtcgt 


ggatgcctgg 


cgggccgagg 


ggctggtctg 


cctgcaatgc 


gcgccggggg 


2400 


acttctaggg 


ccgcgcgacg 


ggggcgcgga 


caggctgggc 


gggaaaccgc 


cccgccacca 


2460 


eg tec cgcac 


gcg ccgaacc 


gcccgcccga 


cgccggtccc 


cgcacggaaa 


cgcgcggcaa 


2520 


gttgacataa 


cttgcacgcg 


acgtctcgat 


tctgcccgcg 


aagaatgcga 


tgcatccaga 


2580 


tgatgcagaa 


cgaagaagcg 


gaagcgcccg 


tgaaagacca 


gatgatttcc 


cataccccgg 


2640 


tgcccacgca 


atgggtcggc 


ccgatcctgt 


tccgcggccc 


cgtcgtcgag 


ggcccgatca 


2700 


gcgcgccgct 


ggccacctac 


gagacgccgc 


tctggccctc 


gaccgcgcgg 


ggggcagggg 


2760 


tttcccggca


ttcgggcggg 


atccaggtct 


cgctggtcga 


cgaacgcatg 


agccgctcga 


2820 
tcgcgctgcg ggcgcatgac ggggcggcgg cgaccgccgc ctggcagtcg atcaaggccc 2880

gccaggaaga ggtcgcggcc gtggtcgcca ccaccagccg cttcgcccgc cttgtcgagc 2940

tgaatcgcca gatcgtgggc aacctgcttt acatccgcat cgaatgcgtg acgggcgacg 3000 

cctcgggtca caacatggtc accaaggccg ccgaggccgt gcagggctgg atcctgtcgg 3060

aatacccgat gctggcctat tccacgatct cggggaacct gtgcaccgac aagaaggcgt 3120

cggcggtcaa cggcatcctg ggccgcggca aatacgccgt cgccgaggtc gagatcccgc 3180 

gcaagatcct gacccgcgtg ctgcgcacca gcgccgagaa gatggtccgc ctgaactacg 3240 

agaagaacta tgtcgggggt acgctggcgg ggtcgctgcg cagtgcgaac gcgcatttcg 3300 

ccaacatgct gctgggcttc tacctggcga cggggcagga cgcggccaac atcatcgagg 3360 

ccagccaggg cttcgtccat tgcgaggccc gcggcgagga tctgtatttc tcgtgcacgc 3420

tgcccaacct catcatgggc tcggtcggtg ccggcaaggg catcccctcg atcgaggaga 3480 

acctgtcgcg gatgggctgc cgccagccgg gcgaacccgg cgacaacgcg cgccgtcttg 3540

cggcgatctg cgcgggcgtc gtgctgtgtg gtgaattgtc gctgcttgcg gcccagacca 3600 

accccggaga gttggtccgc acccacatgg agatggagcg atgaccgaca gcaaggatca 3660

ccatgtcgcg gggcgcaagc tggaccatct gcgtgcattg gacgacgatg cggatatcga 3720 

ccggggcgac agcggcttcg accgcatcgc gctgacccat cgcgccctgc ccgaggtgga 3780 

tttcgacgcc atcgacacgg cgaccagctt cctgggccgt gaactgtcct tcccgctgct 3840 

gatctcgtcc atgaccggcg gcaccggcga ggagatcgag cgcatcaacc gcaacctggc 3900 

cgctggtgcc gaggaggccc gcgtcgccat ggcggtgggc tcgcagcgcg tgatgttcac 3960 

cgacccctcg gcgcgggcca gcttcgacct gcgcgcccat gcgcccaccg tgccgctgct 4020 

ggccaatatc ggcgcggtgc agctgaacat ggggctgggg ctgaaggaat gcctggccgc 4080 

gatcgaggtg ctgcaggcgg acggcctgta tctgcacctg aaccccctgc aagaggccgt 4140 

ccagcccgag ggggatcgcg actttgccga tctgggcagc aagatcgcgg ccatcgcccg 4200 

cgacgttccc gtgcccgtcc tgctgaagga ggtgggctgc ggcctgtcgg cggccgatat 4260 

cgccatcggg ctgcgcgccg ggatccggca tttcgacgtg gccggtcgcg gcggcacatc 4320 

ctggagccgg atcgagtatc gccgccgcca gcgggccgat gacgacctgg gcctggtctt 4380

ccaggactgg ggcctgcaga ccgtggacgc cctgcgcgag gcgcggcccg cgcttgcggc 4440 

ccatgatgga accagcgtgc tgatcgccag cggcggcatc cgcaacggtg tcgacatggc 4500 

gaaatgcgtc atcctggggg ccgacatgtg cggggtcgcc gcgcccctgc tgaaagcggc 4560 
ccaaaactcg


cgcgaggcgg 


ttgtatccgc 


catccggaaa 


ctgcatctgg 


agttccggac 


4620 


agccatgttc 


ctcctgggtt 


gcggcacgct 


tgccgacctg 


aaggacaatt 


cctcgcttat 


4680 


ccgtcaatga 


aagtgcctaa 


gatgaccgtg 


acaggaatcg 


aagcgatcag 


cttctacacc 


4740 


ccccagaact 


acgtgggact 


ggatatcctt 


gccgcgcatc 


acgggatcga 


ccccgagaag 


4800 


ttctcgaagg 


ggatcgggca 


ggagaaaatc 


gcactgcccg 


gccatgacga 


ggatatcgtg 


4860 


accatggccg 


ccgaggccgc 


gctgccgatc 


atcgaacgcg 


cgggcacgca 


gggcatcgac 


4920 


acggttctgt 


tcgccaccga 


gagcgggatc 


gaccagtcga 


aggccgccgc 


catctatctg 


4980 


cgccgcctgc 


tggacctgtc 


gcccaactgc 


cgttgcgtcg 


agctgaagca 


ggcctgctat 


5040 


tccgcgacgg 


cggcgctgca 


gatggcctgc 


gcgcatgtcg 


cccgcaagcc 


cgaccgcaag 


5100 


gtgctggtga 


tcgcgtccga 


tgtcgcgcgc 


tatgaccgcg 


aaagctcggg 


cgaggcgacg 


5160 


cagggtgcgg 


gcgccgtcgc 


catccttgtc 


agcgccgatc 


ccaaggtggc 


cgagatcggc 


5220 


accgtctcgg 


ggctgttcac 


cgaggatatc 


atggatttct 


ggcggccgaa 


ccaccgccgc 


5280 


acgcccctgt 


tcgacggcaa 


ggcatcgacg 


ctgcgctatc 


tgaacgcgct 


ggtcgaggcg 


5340 


tggaacgact 


atcgcgcgaa 


tggcggccac 


gagttcgccg 


atttcgcgca 


tttctgctat 


5400 


cacgtgccgt 


tctcgcggat 


gggcgagaag 


gcgaacagcc 


acctggccaa 


ggcgaacaag 


5460 


acgccggtgg 


acatggggca 


ggtgcagacg 


ggcctgatct 


acaaccggca 


ggtcgggaac 


5520 


tgctataccg 


ggtcgatcta 


cctggcattc 


gcctcgctgc 


tggagaacgc 


tcaggaggac 


5580 


ctgaccggcg 


cgctggtcgg 


tctgttcagc 


tatggctcgg 


gtgcgacggg 


cgaattcttc 


5640 


gatgcgcgga 


tcgcgcccgg 


ttaccgcgac 


cacctgttcg 


cggaacgcca 


tcgcgaattg 


5700 


ctgcaggatc 


gcacgcccgt 


cacatatgac 


gaatacgttg 


ccctgtggga 


cgagatcgac 


5760 


ctgacgcagg 


gcgcgcccga 


caaggcgcgc 


ggtcgtttca 


ggctggcagg 


tatcgaggac 


5820 


gagaagcgca 


tctatgtcga 


ccggcaggcc 


tgaagcaggc 


gcccatgccc 


cgggcaagct 


5880 


gatcctgtcc 


ggggaacatt


ccgtgctcta


tggtgcgccc


gcgcttgcca


tggccatcgc


5940


ccgctatacc 


gaggtgtggt 


tcacgccgct 


tggcattggc 


gaggggatac 


gcacgacatt 


6000 


cgccaatctc 


tcgggcgggg 


cgacctattc 


gctgaagctg 


ctgtcggggt 


tcaagtcgcg 


6060 


gctggaccgc 


cggttcgagc 


agttcctgaa 


cggcgaccta 


aaggtgcaca 


aggtcctgac 


6120 


ccatcccgac 


gatctggcgg 


tctatgcgct 


ggcgtcgctt 


ctgcacgaca 


agccgccggg 


6180 


gaccgccgcg 


atgccgggca 


tcggcgcgat 


gcaccacctg 


ccgcgaccgg 


gtgagctggg 


6240 
cagccggacg


gagctgccca 


tcggcgcggg 


catggggtcg 


tctgcggcca 


tcgtcgcggc 


6300 


caccacggtc 


ctgttcgaga 


cgctgctgga 


ccggcccaag 


acgcccgaac 


agcgcttcga


6360 


ccgcgtccgc 


ttctgcgagc 


ggttgaagca 


cggcaaggcc 


ggtcccatcg 


acgcggccag 


6420 


cgtcgtgcgc 


ggcgggcttg 


tccgcgtggg 


cgggaacggg 


ccgggttcga 


tcagcagctt 


6480 


cgatttgccc 


gaggatcacg 


accttgtcgc 


gggacgcggc 


tggtactggg 


tactgcacgg


6540 


gcgccccgtc 


agcgggaccg 


gcgaatgcgt 


cagcgcggtc 


gcggcggcgc 


atggtcgcga 


6600 


tgcggcgctg 


tgggacgcct 


tcgcagtctg 


cacccgcgcg 


ttggaggccg 


cgctgctgtc 


6660 


tgggggcagc 


cccgacgccg 


ccatcaccga 


gaaccagcgc 


ctgctggaac 


gcatcggcgt


6720 


cgtgccggca 


gcgacgcagg 


ccctcgtggc 


ccagatcgag 


gaggcgggtg 


gcgcggccaa 


6780 


gatctgcggc 


gcaggttccg 


tgcggggcga 


tcacggcggg 


gcggtcctcg 


tgcggattga 


6840 


cgacgcgcag 


gcgatggctt 


cggtcatggc 


gcgccatccc 


gacctcgact 


gggcgcccct 


6900 


gcgcatgtcg 


cgcacggggg 


cggcacccgg 


ccccgcgccg 


cgtgcgcaac 


cgctgccggg 


6960 


gcagggctga 


tggatcaggt 


catccgcgcc 


agcgcgccgg 


gttcggtcat 


gatcacgggc


7020 


gaacatgccg 


tggtctatgg 


acaccgcgcc 


atcgtcgccg 


ggatcgagca 


gcgcgcccat 


7080 


gtgacgatcg 


tcccgcgtgc 


cgaccgcatg 


tttcgcatca 


cctcgcagat 


cggggcgccg 


7140 


cagcaggggt 


cgctggacga 


tctgcctgcg 


ggcgggacct 


atcgcttcgt 


gctggccgcc 


7200 


atcgcgcgac 


acgcgccgga 


cctgccttgc 


gggttcgaca 


tggacatcac 


ctcggggatc


7260 


gatccgaggc 


tcgggcttgg 


atcctcggcg 


gcggtgacgg 


tcgcctgcct 


cggcgcgctg 


7320 


tcgcggctgg 


cggggcgggg 


gaccgagggg 


ctgcatgacg 


acgcgctgcg 


catcgtccgc 


7380 


gccatccagg 


gcaggggcag 


cggggccgat 


ctggcggcca 


gcctgcatgg 


cggcttcgtc 


7440 


gcctatcgcg 


cgcccgatgg 


cggtgccgcg 


cagatcgagg 


cgcttccggt 


gccgccgggg 


7500 


ccgttcggcc 


tgcgctatgc 


gggctacaag 


accccgacag 


ccgaggtgct 


gcgccttgtg 


7560 


gccgatcgga 


tggcgggcaa 


cgaggccgct 


ttcgacgcgc 


tctactcccg 


gatgggcgca


7620 


agcgcagatg 


ccgcgatccg 


cgcggcgcaa 


gggctggact 


gggctgcatt 


ccacgacgcg 


7680 


ctgaacgaat 


accagcgcct 


gatggagcag 


ctgggcgtgt 


ccgacgacac 


gctggacgcg 


7740 


atcatccgcg 


aggcgcgcga 


cgcgggcgcc 


gcagtcgcca 


agatctccgg 


ctcggggctg


7800 


ggggattgcg 


tgctggcact 


gggcgaccag 


cccaagggtt 


tcgtgcccgc 


aagcattgcc


7860 


gagaagggac ttgttttcg atg act gat gcc gtc cgc gac atg atc gcc cgt 7912
Met Thr Asp Ala Val Arg Asp Met Ile Ala Arg
1 5 10

gcc atg gcg ggc gcg acc gac atc cga gca gcc gag gct tat gcg ccc 7960
Ala Met Ala Gly Ala Thr Asp Ile Arg Ala Ala Glu Ala Tyr Ala Pro
15 20 25

agc aac atc gcg ctg tcg aaa tac tgg ggc aag cgc gac gcc gcg cgg 8008
Ser Asn Ile Ala Leu Ser Lys Tyr Trp Gly Lys Arg Asp Ala Ala Arg
30 35 40

aac ctt ccg ctg aac agc tcc gtc tcg atc tcg ttg gcg aac tgg ggc 8056
Asn Leu Pro Leu Asn Ser Ser Val Ser Ile Ser Leu Ala Asn Trp Gly
45 50 55

tct cat acg cgg gtc gag ggg tcc ggc acg ggc cac gac gag gtg cat 8104
Ser His Thr Arg Val Glu Gly Ser Gly Thr Gly His Asp Glu Val His
60 65 70 75

cac aac ggc acg ctg ctg gat ccg ggc gac gcc ttc gcg cgc cgc gcg 8152
His Asn Gly Thr Leu Leu Asp Pro Gly Asp Ala Phe Ala Arg Arg Ala
80 85 90

ttg gca ttc gct gac ctg ttc cgg ggg ggg agg cac ctg ccg ctg cgg 8200
Leu Ala Phe Ala Asp Leu Phe Arg Gly Gly Arg His Leu Pro Leu Arg
95 100 105

atc acg acg cag aac tcg atc ccg acg gcg gcg ggg ctt gcc tcg tcg 8248
Ile Thr Thr Gln Asn Ser Ile Pro Thr Ala Ala Gly Leu Ala Ser Ser
110 115 120

gcc tcg ggg ttc gcg gcg ctg acc cgt gcg ctg gcg ggg gcg ttc ggg 8296
Ala Ser Gly Phe Ala Ala Leu Thr Arg Ala Leu Ala Gly Ala Phe Gly
125 130 135

ctg gat ctg gac gac acg gat ctg agc cgc atc gcc cgg atc ggc agt 8344
Leu Asp Leu Asp Asp Thr Asp Leu Ser Arg Ile Ala Arg Ile Gly Ser
140 145 150 155

ggc agc gcc gcc cgc tcg atc tgg cac ggc ttc gtc cgc tgg aac cgg 8392
Gly Ser Ala Ala Arg Ser Ile Trp His Gly Phe Val Arg Trp Asn Arg
160 165 170

ggc gag gcc gag gat ggg cat gac agc cac ggc gtc ccg ctg gac ctg 8440
Gly Glu Ala Glu Asp Gly His Asp Ser His Gly Val Pro Leu Asp Leu
175 180 185

cgc tgg ccc ggc ttc cgc atc gcg atc gtg gcc gtg gac aag ggg ccc 8488
Arg Trp Pro Gly Phe Arg Ile Ala Ile Val Ala Val Asp Lys Gly Pro
190 195 200

aag cct ttc agt tcg cgc gac ggc atg aac cac acg gtc gag acc agc 8536
Lys Pro Phe Ser Ser Arg Asp Gly Met Asn His Thr Val Glu Thr Ser
205 210 215

ccg ctg ttc ccg ccc tgg cct gcg cag gcg gaa gcg gat tgc cgc gtc 8584
Pro Leu Phe Pro Pro Trp Pro Ala Gln Ala Glu Ala Asp Cys Arg Val
220 225 230 235
atc gag gat gcg atc gcc gcc cgc gac atg gcc gcc ctg ggt ccg cgg 8632
Ile Glu Asp Ala Ile Ala Ala Arg Asp Met Ala Ala Leu Gly Pro Arg
240 245 250

gtc gag gcg aac gcc ctt gcg atg cac gcc acg atg atg gcc gcg cgc 8680
Val Glu Ala Asn Ala Leu Ala Met His Ala Thr Met Met Ala Ala Arg
255 260 265

ccg ccg ctc tgc tac ctg acg ggc ggc agc tgg cag gtg ctg gaa cgc 8728
Pro Pro Leu Cys Tyr Leu Thr Gly Gly Ser Trp Gln Val Leu Glu Arg
270 275 280

ctg tgg cag gcc cgc gcg gac ggg ctt gcg gcc ttt gcg acg atg gat 8776
Leu Trp Gln Ala Arg Ala Asp Gly Leu Ala Ala Phe Ala Thr Met Asp
285 290 295

gcc ggc ccg aac gtc aag ctg atc ttc gag gaa agc agc gcc gcc gac 8824
Ala Gly Pro Asn Val Lys Leu Ile Phe Glu Glu Ser Ser Ala Ala Asp
300 305 310 315

gtg ctg tac ctg ttc ccc gac gcc agc ctg atc gcg ccg ttc gag ggg 8872
Val Leu Tyr Leu Phe Pro Asp Ala Ser Leu Ile Ala Pro Phe Glu Gly
320 325 330

cgt tga acgcgtaaga cgaccactgg gtaaggttct gccgcgcgtg gtctcgactg 8928
Arg

cctgcaaaga ggtgcttgag ttgctgcgtg actgcggcgg ccgacttcgt gggacttgcc 8988
cgccacgctg acgcgctgga aacgcgcccg cggattacga ccgcgtcatt gccctgaacc 9048
aatttcccgt cggtcgac

<210> 53
<211> 332
<212> PRT
<213> Paracoccus sp. R114

<400> 53

Met Thr Asp Ala Val Arg Asp Met Ile Ala Arg Ala Met Ala Gly Ala
1 5 10 15

Thr Asp Ile Arg Ala Ala Glu Ala Tyr Ala Pro Ser Asn Ile Ala Leu
20 25 30

Ser Lys Tyr Trp Gly Lys Arg Asp Ala Ala Arg Asn Leu Pro Leu Asn
35 40 45

Ser Ser Val Ser Ile Ser Leu Ala Asn Trp Gly Ser His Thr Arg Val
50 55 60

Glu Gly Ser Gly Thr Gly His Asp Glu Val His His Asn Gly Thr Leu
65 70 75 80

Leu Asp Pro Gly Asp Ala Phe Ala Arg Arg Ala Leu Ala Phe Ala Asp
85 90 95

Leu Phe Arg Gly Gly Arg His Leu Pro Leu Arg Ile Thr Thr Gln Asn
100 105 110

Ser Ile Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ser Gly Phe Ala
115 120 125

Ala Leu Thr Arg Ala Leu Ala Gly Ala Phe Gly Leu Asp Leu Asp Asp
130 135 140

Thr Asp Leu Ser Arg Ile Ala Arg Ile Gly Ser Gly Ser Ala Ala Arg
145 150 155 160

Ser Ile Trp His Gly Phe Val Arg Trp Asn Arg Gly Glu Ala Glu Asp
165 170 175

Gly His Asp Ser His Gly Val Pro Leu Asp Leu Arg Trp Pro Gly Phe
180 185 190

Arg Ile Ala Ile Val Ala Val Asp Lys Gly Pro Lys Pro Phe Ser Ser
195 200 205

Arg Asp Gly Met Asn His Thr Val Glu Thr Ser Pro Leu Phe Pro Pro
210 215 220

Trp Pro Ala Gln Ala Glu Ala Asp Cys Arg Val Ile Glu Asp Ala Ile
225 230 235 240

Ala Ala Arg Asp Met Ala Ala Leu Gly Pro Arg Val Glu Ala Asn Ala
245 250 255

Leu Ala Met His Ala Thr Met Met Ala Ala Arg Pro Pro Leu Cys Tyr
260 265 270

Leu Thr Gly Gly Ser Trp Gln Val Leu Glu Arg Leu Trp Gln Ala Arg
275 280 285

Ala Asp Gly Leu Ala Ala Phe Ala Thr Met Asp Ala Gly Pro Asn Val
290 295 300

Lys Leu Ile Phe Glu Glu Ser Ser Ala Ala Asp Val Leu Tyr Leu Phe
305 310 315 320

Pro Asp Ala Ser Leu Ile Ala Pro Phe Glu Gly Arg
325 330



<210> 54 

<211> 353 

<212> PRT 

<213> Streptomyces sp. strain CL190 



<400> 54 

Met Thr Glu Thr His Ala Ile Ala Gly Val Pro Met Arg Trp Val Gly
1 5 10 15

Pro Leu Arg Ile Ser Gly Asn Val Ala Glu Thr Glu Thr Gln Val Pro
20 25 30

Leu Ala Thr Tyr Glu Ser Pro Leu Trp Pro Ser Val Gly Arg Gly Ala
35 40 45

Lys Val Ser Arg Leu Thr Glu Lys Gly Ile Val Ala Thr Leu Val Asp
50 55 60

Glu Arg Met Thr Arg Ser Val Ile Val Glu Ala Thr Asp Ala Gln Thr
65 70 75 80

Ala Tyr Met Ala Ala Gln Thr Ile His Ala Arg Ile Asp Glu Leu Arg
85 90 95

Glu Val Val Arg Gly Cys Ser Arg Phe Ala Gln Leu Ile Asn Ile Lys
100 105 110

His Glu Ile Asn Ala Asn Leu Leu Phe Ile Arg Phe Glu Phe Thr Thr
115 120 125

Gly Asp Ala Ser Gly His Asn Met Ala Thr Leu Ala Ser Asp Val Leu
130 135 140

Leu Gly His Leu Leu Glu Thr Ile Pro Gly Ile Ser Tyr Gly Ser Ile
145 150 155 160

Ser Gly Asn Tyr Cys Thr Asp Lys Lys Ala Thr Ala Ile Asn Gly Ile
165 170 175

Leu Gly Arg Gly Lys Asn Val Ile Thr Glu Leu Leu Val Pro Arg Asp
180 185 190

Val Val Glu Asn Asn Leu His Thr Thr Ala Ala Lys Ile Val Glu Leu
195 200 205

Asn Ile Arg Lys Asn Leu Leu Gly Thr Leu Leu Ala Gly Gly Ile Arg
210 215 220

Ser Ala Asn Ala His Phe Ala Asn Met Leu Leu Gly Phe Tyr Leu Ala
225 230 235 240

Thr Gly Gln Asp Ala Ala Asn Ile Val Glu Gly Ser Gln Gly Val Val
245 250 255

Met Ala Glu Asp Arg Asp Gly Asp Leu Tyr Phe Ala Cys Thr Leu Pro
260 265 270

Asn Leu Ile Val Gly Thr Val Gly Asn Gly Lys Gly Leu Gly Phe Val
275 280 285

Glu Thr Asn Leu Ala Arg Leu Gly Cys Arg Ala Asp Arg Glu Pro Gly
290 295 300

Glu Asn Ala Arg Arg Leu Ala Val Ile Ala Ala Ala Thr Val Leu Cys
305 310 315 320

Gly Glu Leu Ser Leu Leu Ala Ala Gln Thr Asn Pro Gly Glu Leu Met
325 330 335

Arg Ala His Val Gln Leu Glu Arg Asp Asn Lys Thr Ala Lys Val Gly
340 345 350

Ala



<210> 55 

<211> 353 

<212> PRT 

<213> Streptomyces griseolosporeus 



<400> 55 

Met Thr Glu Ala His Ala Thr Ala Gly Val Pro Met Arg Trp Val Gly
1 5 10 15

Pro Val Arg Ile Ser Gly Asn Val Ala Thr Ile Glu Thr Gln Val Pro
20 25 30

Leu Ala Thr Tyr Glu Ser Pro Leu Trp Pro Ser Val Gly Arg Gly Ala
35 40 45

Lys Val Ser Arg Leu Thr Glu Lys Gly Ile Val Ala Thr Leu Val Asp
50 55 60

Glu Arg Met Thr Arg Ser Val Leu Val Glu Ala Thr Asp Ala Leu Thr
65 70 75 80

Ala Leu Ser Ala Ala Arg Thr Ile Glu Ala Arg Ile Asp Glu Leu Arg
85 90 95

Glu Leu Val Arg Gly Cys Ser Arg Phe Ala Gln Leu Ile Gly Ile Arg
100 105 110

His Glu Ile Thr Gly Asn Leu Leu Phe Val Arg Phe Glu Phe Ser Thr
115 120 125

Gly Asp Ala Ser Gly His Asn Met Ala Thr Leu Ala Ser Asp Val Leu
130 135 140

Leu Gln His Leu Leu Glu Thr Val Pro Gly Ile Ser Tyr Gly Ser Ile
145 150 155 160

Ser Gly Asn Tyr Cys Thr Asp Lys Lys Ala Thr Ala Ile Asn Gly Ile
165 170 175

Leu Gly Arg Gly Lys Asn Val Val Thr Glu Leu Leu Val Pro Arg Asp
180 185 190

Val Val Ala Asp Val Leu Asn Thr Thr Ala Ala Lys Ile Ala Glu Leu
195 200 205

Asn Leu Arg Lys Asn Leu Leu Gly Thr Leu Leu Ala Gly Gly Ile Arg
210 215 220

Ser Ala Asn Ala His Tyr Ala Asn Met Leu Leu Ala Phe Tyr Leu Ala
225 230 235 240

Thr Gly Gln Asp Ala Ala Asn Ile Val Glu Gly Ser Gln Gly Val Val
245 250 255

Thr Ala Glu Asp Arg Asp Gly Asp Leu Tyr Leu Ala Cys Thr Leu Pro
260 265 270

Asn Leu Ile Val Gly Thr Val Gly Asn Gly Lys Gly Leu Gly Phe Val
275 280 285

Glu Thr Asn Leu Asn Arg Leu Gly Cys Arg Ala Asp Arg Glu Pro Gly
290 295 300

Glu Asn Ala Arg Arg Leu Ala Val Ile Ala Ala Ala Thr Val Leu Cys
305 310 315 320

Gly Glu Leu Ser Leu Leu Ala Ala Gln Thr Asn Pro Gly Glu Leu Met
325 330 335

Arg Ala His Val Gln Leu Glu Arg Gly His Thr Thr Ala Lys Ala Gly
340 345 350

Val



<210> 56 

<211> 353 

<212> PRT 

<213> Streptomyces sp. strain KO-3899

<400> 56

Met Thr Asp Thr His Ala Ile Ala Met Val Pro Met Lys Trp Val Gly
1 5 10 15

Pro Leu Arg Ile Ser Gly Asn Val Ala Thr Thr Glu Thr His Val Pro
20 25 30

Leu Ala Thr Tyr Glu Thr Pro Leu Trp Pro Ser Val Gly Arg Gly Ala
35 40 45

Lys Val Ser Met Leu Ser Glu Arg Gly Ile Ala Ala Thr Leu Val Asp
50 55 60

Glu Arg Met Thr Arg Ser Val Leu Val Glu Ala Thr Asp Ala Gln Thr
65 70 75 80

Ala Tyr Thr Ala Ala Arg Ala Ile Glu Ala Arg Ile Glu Glu Leu Arg
85 90 95

Ala Val Val Arg Thr Cys Ser Arg Phe Ala Glu Leu Leu Gln Val Arg
100 105 110

His Glu Ile Ala Gly Asn Leu Leu Phe Val Arg Phe Glu Phe Ser Thr
115 120 125

Arg Arg Pro Ser Gly His Asn Met Ala Thr Leu Ala Ser Asp Ala Leu
130 135 140

Leu Ala His Leu Leu Gln Thr Ile Pro Gly Ile Ser Tyr Gly Ser Ile
145 150 155 160

Ser Gly Asn Tyr Cys Thr Asp Lys Lys Ala Thr Ala Ile Asn Gly Ile
165 170 175

Leu Gly Arg Gly Lys Asn Val Val Thr Glu Leu Val Val Pro Arg Glu
180 185 190

Val Val Glu Arg Val Leu His Thr Thr Ala Ala Lys Ile Val Glu Leu
195 200 205

Asn Ile Arg Lys Asn Leu Leu Gly Thr Leu Leu Ala Gly Gly Ile Arg
210 215 220

Ser Ala Asn Ala His Tyr Ala Asn Met Leu Leu Gly Phe Tyr Leu Ala
225 230 235 240

Thr Gly Gln Asp Ala Ala Asn Ile Val Glu Gly Ser Gln Gly Val Thr
245 250 255

Leu Ala Glu Asp Arg Asp Gly Asp Leu Tyr Phe Ser Cys Asn Leu Pro
260 265 270

Asn Leu Ile Val Gly Thr Val Gly Asn Gly Lys Gly Leu Glu Phe Val
275 280 285

Glu Thr Asn Leu Asn Arg Leu Gly Cys Arg Glu Asp Arg Ala Pro Gly
290 295 300

Glu Asn Ala Arg Arg Leu Ala Val Ile Ala Ala Ala Thr Val Leu Cys
305 310 315 320

Gly Glu Leu Ser Leu Leu Ala Ala Gln Thr Asn Pro Gly Glu Leu Met
325 330 335

Arg Ala His Val Glu Leu Glu Arg Asp Asn Thr Thr Ala Glu Val Gly
340 345 350

Val

<210> 57 
<211> 347 
<212> PRT 

<213> Erwinia herbicola

<400> 57

Met Lys Asp Glu Arg Leu Val Gln Arg Lys Asn Asp His Leu Asp Ile
1 5 10 15

Val Leu Asp Pro Arg Arg Ala Val Thr Gln Ala Ser Ala Gly Phe Glu
20 25 30

Arg Trp Arg Phe Thr His Cys Ala Leu Pro Glu Leu Asn Phe Ser Asp
35 40 45

Ile Thr Leu Glu Thr Thr Phe Leu Asn Arg Gln Leu Gln Ala Pro Leu
50 55 60

Leu Ile Ser Ser Met Thr Gly Gly Val Glu Arg Ser Arg His Ile Asn
65 70 75 80

Arg His Leu Ala Glu Ala Ala Gln Val Leu Lys Ile Ala Met Gly Val
85 90 95

Gly Ser Gln Arg Val Ala Ile Glu Ser Asp Ala Gly Leu Gly Leu Asp
100 105 110

Lys Thr Leu Arg Gln Leu Ala Pro Asp Val Pro Leu Leu Ala Asn Leu
115 120 125

Gly Ala Ala Gln Leu Thr Gly Arg Lys Gly Ile Asp Tyr Ala Arg Arg
130 135 140

Ala Val Glu Met Ile Glu Ala Asp Ala Leu Ile Val His Leu Asn Pro
145 150 155 160

Leu Gln Glu Ala Leu Gln Pro Gly Gly Asp Arg Asp Trp Arg Gly Arg
165 170 175

Leu Ala Ala Ile Glu Thr Leu Val Arg Glu Leu Pro Val Pro Leu Val
180 185 190

Val Lys Glu Val Gly Ala Gly Ile Ser Arg Thr Val Ala Gly Gln Leu
195 200 205

Ile Asp Ala Gly Val Thr Val Ile Asp Val Ala Gly Ala Gly Gly Thr
210 215 220

Ser Trp Ala Ala Val Glu Gly Glu Arg Ala Ala Thr Glu Gln Gln Arg
225 230 235 240

Ser Val Ala Asn Val Phe Ala Asp Trp Gly Ile Pro Thr Ala Glu Ala
245 250 255

Leu Val Asp Ile Ala Glu Ala Trp Pro Gln Met Pro Leu Ile Ala Ser
260 265 270

Gly Gly Ile Lys Asn Gly Val Asp Ala Ala Lys Ala Leu Arg Leu Gly
275 280 285

Ala Cys Met Val Gly Gln Ala Ala Ala Val Leu Gly Ser Ala Gly Val
290 295 300

Ser Thr Glu Lys Val Ile Asp His Phe Asn Val Ile Ile Glu Gln Leu
305 310 315 320

Arg Val Ala Cys Phe Cys Thr Gly Ser Arg Ser Leu Ser Asp Leu Lys
325 330 335

Gln Ala Asp Ile Arg Tyr Val Arg Asp Thr Pro
340 345

<210> 58 
<211> 360 
<212> PRT
<213> Borrelia burgdorferi

<400> 58

Met Met Asp Thr Glu Phe Met Gly Ile Glu Pro Asn Ile Leu Glu Asn
1 5 10 15

Lys Lys Arg His Ile Glu Ile Cys Leu Asn Lys Asn Asp Val Lys Gly
20 25 30

Gly Cys Asn Phe Leu Lys Phe Ile Lys Leu Lys His Asn Ala Leu Ser
35 40 45

Asp Phe Asn Phe Ser Glu Ile Asn Ile Lys Glu Glu Ile Phe Gly Tyr
50 55 60

Asn Ile Ser Met Pro Val Phe Ile Ser Ser Met Thr Gly Gly Ser Lys
65 70 75 80

Glu Gly Asn Asp Phe Asn Lys Ser Leu Val Arg Ile Ala Asn Tyr Leu
85 90 95

Lys Ile Pro Ile Gly Leu Gly Ser Phe Lys Leu Leu Phe Lys Tyr Pro
100 105 110

Glu Tyr Ile Arg Asp Phe Thr Leu Lys Arg Tyr Ala His Asn Ile Pro
115 120 125

Leu Phe Ala Asn Val Gly Ala Val Gln Ile Val Glu Phe Gly Ile Ser
130 135 140

Lys Ile Ala Glu Met Ile Lys Arg Leu Glu Val Asp Ala Ile Ile Val
145 150 155 160

His Leu Asn Ala Gly Gln Glu Leu Met Lys Val Asp Gly Asp Arg Asn
165 170 175

Phe Lys Gly Ile Arg Glu Ser Ile Ala Lys Leu Ser Asp Phe Leu Ser
180 185 190

Val Pro Leu Ile Val Lys Glu Thr Gly Phe Gly Ile Ser Pro Lys Asp
195 200 205

Val Lys Glu Leu Phe Ser Leu Gly Ala Ser Tyr Val Asp Leu Ala Gly
210 215 220

Ser Gly Gly Thr Asn Trp Ile Leu Val Glu Gly Met Lys Ser Asn Asn
225 230 235 240

Leu Asn Ile Ala Ser Cys Phe Ser Asp Trp Gly Ile Pro Ser Val Phe
245 250 255

Thr Leu Leu Ser Ile Asp Asp Ser Leu Lys Ala Asn Ile Phe Ala Ser
260 265 270

Gly Gly Tyr Glu Thr Gly Met Asp Ile Ala Lys Gly Ile Ala Leu Gly
275 280 285

Ala Arg Leu Ile Gly Val Ala Ala Val Val Leu Arg Ala Phe Tyr Asp
290 295 300

Ser Gly Glu Asp Ala Val Phe Gly Leu Phe Ser Asp Tyr Glu His Ile
305 310 315 320

Leu Lys Met Ser Met Phe Leu Ser Gly Ser Lys Ser Leu Leu Glu Phe
325 330 335

Arg Asn Asn Lys Tyr Phe Leu Ser Ser Tyr Leu Leu Asp Glu Leu Gly
340 345 350

Val Phe Lys Gln Phe Tyr Gly Thr
355 360

<210> 59 

<211> 349 

<212> PRT 

<213> Synechocystis sp. PCC 6803

<400> 59

Met Asp Ser Thr Pro His Arg Lys Ser Asp His Ile Arg Ile Val Leu
1 5 10 15

Glu Glu Asp Val Val Gly Lys Gly Ile Ser Thr Gly Phe Glu Arg Leu
20 25 30

Met Leu Glu His Cys Ala Leu Pro Ala Val Asp Leu Asp Ala Val Asp
35 40 45

Leu Gly Leu Thr Leu Trp Gly Lys Ser Leu Thr Tyr Pro Trp Leu Ile
50 55 60

Ser Ser Met Thr Gly Gly Thr Pro Glu Ala Lys Gln Ile Asn Leu Phe
65 70 75 80

Leu Ala Glu Val Ala Gln Ala Leu Gly Ile Ala Met Gly Leu Gly Ser
85 90 95

Gln Arg Ala Ala Ile Glu Asn Pro Asp Leu Ala Phe Thr Tyr Gln Val
100 105 110

Arg Ser Val Ala Pro Asp Ile Leu Leu Phe Ala Asn Leu Gly Leu Val
115 120 125

Gln Leu Asn Tyr Gly Tyr Gly Leu Glu Gln Ala Gln Arg Ala Val Asp
130 135 140

Met Ile Glu Ala Asp Ala Leu Ile Leu His Leu Asn Pro Leu Gln Glu
145 150 155 160

Ala Val Gln Pro Asp Gly Asp Arg Leu Trp Ser Gly Leu Trp Ser Lys
165 170 175

Leu Glu Ala Leu Val Glu Ala Leu Glu Val Pro Val Ile Val Lys Glu
180 185 190

Val Gly Asn Gly Ile Ser Gly Pro Val Ala Lys Arg Leu Gln Glu Cys
195 200 205

Gly Val Gly Ala Ile Asp Val Ala Gly Ala Gly Gly Thr Ser Trp Ser
210 215 220

Glu Val Glu Ala His Arg Gln Thr Asp Arg Gln Ala Lys Glu Val Ala
225 230 235 240

His Asn Phe Ala Asp Trp Gly Leu Pro Thr Ala Trp Ser Leu Gln Gln
245 250 255

Val Val Gln Asn Thr Glu Gln Ile Leu Val Phe Ala Ser Gly Gly Ile
260 265 270

Arg Ser Gly Ile Asp Gly Ala Lys Ala Ile Ala Leu Gly Ala Thr Leu
275 280 285

Val Gly Ser Ala Ala Pro Val Leu Ala Glu Ala Lys Ile Asn Ala Gln
290 295 300

Arg Val Tyr Asp His Tyr Gln Ala Arg Leu Arg Glu Leu Gln Ile Ala
305 310 315 320

Ala Phe Cys Cys Asp Ala Ala Asn Leu Thr Gln Leu Ala Gln Val Pro
325 330 335

Leu Trp Asp Arg Gln Ser Gly Gln Arg Leu Thr Lys Pro
340 345

<210> 60 
<211> 361 
<212> PRT 

<213> Streptomyces sp. CL190 



<400> 60 

Met Thr Ser Ala Gln Arg Lys Asp Asp His Val Arg Leu Ala Ile Glu
1 5 10 15

Gln His Asn Ala His Ser Gly Arg Asn Gln Asp Asp Val Ser Phe Val
20 25 30

His His Ala Leu Ala Gly Ile Asp Arg Pro Asp Val Ser Leu Ala Thr
35 40 45

Ser Phe Ala Gly Ile Ser Trp Gln Val Pro Ile Tyr Ile Asn Ala Met
50 55 60

Thr Gly Gly Ser Glu Lys Thr Gly Leu Ile Asn Arg Asp Leu Ala Thr
65 70 75 80

Ala Ala Arg Glu Thr Gly Val Pro Ile Ala Ser Gly Ser Met Asn Ala
85 90 95

Tyr Ile Lys Asp Pro Cys Ala Asp Thr Phe Arg Val Leu Arg Asp Glu
100 105 110

Asn Pro Asn Gly Phe Val Ile Ala Asn Ile Asn Ala Thr Thr Thr Val
115 120 125

Asp Asn Ala Gln Arg Ala Ile Asp Leu Ile Glu Ala Asn Ala Leu Gln
130 135 140

Ile His Ile Asn Thr Ala Gln Glu Thr Pro Met Pro Glu Gly Asp Arg
145 150 155 160

Ser Phe Ala Ser Trp Val Pro Gln Ile Glu Lys Ile Ala Ala Ala Val
165 170 175

Asp Ile Pro Val Ile Val Lys Glu Val Gly Asn Gly Leu Ser Arg Gln
180 185 190

Thr Ile Leu Leu Leu Ala Asp Leu Gly Val Gln Ala Ala Asp Val Ser
195 200 205

Gly Arg Gly Gly Thr Asp Phe Ala Arg Ile Glu Asn Gly Arg Arg Glu
210 215 220

Leu Gly Asp Tyr Ala Phe Leu His Gly Trp Gly Gln Ser Thr Ala Ala
225 230 235 240

Cys Leu Leu Asp Ala Gln Asp Ile Ser Leu Pro Val Leu Ala Ser Gly
245 250 255

Gly Val Arg His Pro Leu Asp Val Val Arg Ala Leu Ala Leu Gly Ala
260 265 270

Arg Ala Val Gly Ser Ser Ala Gly Phe Leu Arg Thr Leu Met Asp Asp
275 280 285

Gly Val Asp Ala Leu Ile Thr Lys Leu Thr Thr Trp Leu Asp Gln Leu
290 295 300

Ala Ala Leu Gln Thr Met Leu Gly Ala Arg Thr Pro Ala Asp Leu Thr
305 310 315 320

Arg Cys Asp Val Leu Leu His Gly Glu Leu Arg Asp Phe Cys Ala Asp
325 330 335

Arg Gly Ile Asp Thr Arg Arg Leu Ala Gln Arg Ser Ser Ser Ile Glu
340 345 350

Ala Leu Gln Thr Thr Gly Ser Thr Arg
355 360

<210> 61 

<211> 364 

<212> PRT 

<213> Streptomyces griseolosporeus 



<400> 61 

Met Ser Ser Ala Gln Arg Lys Asp Asp His Val Arg Leu Ala Thr Glu
1 5 10 15

Gln Gln Arg Ala His Ser Gly Arg Asn Gln Phe Asp Asp Val Ser Phe
20 25 30

Val His His Ala Leu Ala Gly Ile Asp Arg Pro Asp Val Arg Leu Ala
35 40 45

Thr Thr Phe Ala Gly Ile Thr Trp Arg Leu Pro Leu Tyr Ile Asn Ala
50 55 60

Met Thr Gly Gly Ser Ala Lys Thr Gly Ala Ile Asn Arg Asp Leu Ala
65 70 75 80

Val Ala Ala Arg Glu Thr Gly Ala Ala Ile Ala Ser Gly Ser Met His
85 90 95

Ala Phe Phe Arg Asp Pro Ser Cys Ala Asp Thr Phe Arg Val Leu Arg
100 105 110

Thr Glu Asn Pro Asp Gly Phe Val Met Ala Asn Val Asn Ala Thr Ala
115 120 125

Ser Val Asp Asn Ala Arg Arg Ala Val Asp Leu Ile Glu Ala Asn Ala
130 135 140

Leu Gln Ile His Leu Asn Thr Ala Gln Glu Thr Pro Met Pro Glu Gly
145 150 155 160

Asp Arg Ser Phe Gly Ser Trp Pro Ala Gln Ile Ala Lys Ile Thr Ala
165 170 175

Ala Val Asp Val Pro Val Ile Val Lys Glu Val Gly Asn Gly Leu Ser
180 185 190

Arg Gln Thr Leu Leu Ala Leu Pro Asp Leu Gly Val Arg Val Ala Asp
195 200 205

Val Ser Gly Arg Gly Gly Thr Asp Phe Ala Arg Ile Glu Asn Ser Arg
210 215 220

Arg Pro Leu Gly Asp Tyr Ala Phe Leu His Gly Trp Gly Gln Ser Thr
225 230 235 240

Pro Ala Cys Leu Leu Asp Ala Gln Asp Val Gly Phe Pro Leu Leu Ala
245 250 255

Ser Gly Gly Ile Arg Asn Pro Leu Asp Val Ala Arg Ala Leu Ala Leu
260 265 270

Gly Ala Gly Ala Val Gly Ser Ser Gly Val Phe Leu Arg Thr Leu Ile
275 280 285

Asp Gly Gly Val Ser Ala Leu Val Ala Gln Ile Ser Thr Trp Leu Asp
290 295 300

Gln Leu Ala Ala Leu Gln Thr Met Leu Gly Ala Arg Thr Pro Ala Asp
305 310 315 320

Leu Thr Arg Cys Asp Val Leu Ile His Gly Pro Leu Arg Ser Phe Cys
325 330 335

Thr Asp Arg Gly Ile Asp Ile Gly Arg Phe Ala Arg Arg Ser Ser Ser
340 345 350

Ala Asp Ile Arg Ser Glu Met Thr Gly Ser Thr Arg
355 360

<210> 62 

<211> 368 

<212> PRT 

<213> Sulfolobus solfataricus 



<400> 62 

Met Pro Asp Ile Val Asn Arg Lys Val Glu His Val Glu Ile Ala Ala
1 5 10 15

Phe Glu Asn Val Asp Gly Leu Ser Ser Ser Thr Phe Leu Asn Asp Val
20 25 30

Ile Leu Val His Gln Gly Phe Pro Gly Ile Ser Phe Ser Glu Ile Asn
35 40 45

Thr Lys Thr Lys Phe Phe Arg Lys Glu Ile Ser Ala Pro Ile Met Val
50 55 60

Thr Gly Met Thr Gly Gly Arg Asn Glu Leu Gly Arg Ile Asn Arg Ile
65 70 75 80

Ile Ala Glu Val Ala Glu Lys Phe Gly Ile Pro Met Gly Val Gly Ser
85 90 95

Gln Arg Val Ala Ile Glu Lys Ala Glu Ala Arg Glu Ser Phe Thr Ile
100 105 110

Val Arg Lys Val Ala Pro Thr Ile Pro Ile Ile Ala Asn Leu Gly Met
115 120 125

Pro Gln Leu Val Lys Gly Tyr Gly Leu Lys Glu Phe Gln Asp Ala Ile
130 135 140

Gln Met Ile Glu Ala Asp Ala Ile Ala Val His Leu Asn Pro Ala Gln
145 150 155 160

Glu Val Phe Gln Pro Glu Gly Glu Pro Glu Tyr Gln Ile Tyr Ala Leu
165 170 175

Glu Arg Leu Arg Asp Ile Ser Lys Glu Leu Ser Val Pro Ile Ile Val
180 185 190

Lys Glu Ser Gly Asn Gly Ile Ser Met Glu Thr Ala Lys Leu Leu Tyr
195 200 205

Ser Tyr Gly Ile Lys Asn Phe Asp Thr Ser Gly Gln Gly Gly Thr Asn
210 215 220

Trp Ile Ala Ile Glu Met Ile Arg Asp Ile Arg Arg Gly Asn Trp Lys
225 230 235 240

Ala Glu Ser Ala Lys Asn Phe Leu Asp Trp Gly Val Pro Thr Ala Ala
245 250 255

Ser Ile Ile Glu Val Arg Tyr Ser Ile Pro Asp Ala Phe Leu Val Gly
260 265 270

Ser Gly Gly Ile Arg Ser Gly Leu Asp Ala Ala Lys Ala Ile Ala Leu
275 280 285

Gly Ala Asp Ile Ala Gly Met Ala Leu Pro Val Leu Lys Ser Ala Ile
290 295 300

Glu Gly Lys Glu Ser Leu Glu Gln Phe Phe Arg Lys Ile Ile Phe Glu
305 310 315 320

Leu Lys Ala Thr Met Met Leu Thr Gly Ser Lys Asn Val Glu Ala Leu
325 330 335

Lys Arg Ser Ser Ile Val Ile Leu Gly Lys Leu Lys Glu Trp Ala Glu
340 345 350

Tyr Arg Gly Ile Asn Leu Ser Ile Tyr Glu Lys Val Arg Lys Arg Glu
355 360 365

<210> 63 
<211> 342
<212> PRT
<213> Rickettsia prowazekii

<400> 63

Met Pro Lys Glu Gln Asn Leu Asp Ile Glu Arg Lys Gln Glu His Ile
1 5 10 15

Glu Ile Asn Leu Lys Gln Asn Val Asn Ser Thr Leu Lys Ser Gly Leu
20 25 30

Glu Ser Ile Lys Phe Ile His Asn Ala Leu Pro Glu Ile Asn Tyr Asp
35 40 45

Ser Ile Asp Thr Thr Thr Thr Phe Leu Gly Lys Asp Met Lys Ala Pro
50 55 60

Ile Leu Ile Ser Ser Met Thr Gly Gly Thr Ala Arg Ala Arg Asp Ile
65 70 75 80

Asn Tyr Arg Leu Ala Gln Ala Ala Gln Lys Ser Gly Ile Ala Met Gly
85 90 95

Leu Gly Ser Met Arg Ile Leu Leu Thr Lys Pro Asp Thr Ile Lys Thr
100 105 110

Phe Thr Val Arg His Val Ala Pro Asp Ile Pro Leu Leu Ala Asn Ile
115 120 125

Gly Ala Val Gln Leu Asn Tyr Gly Val Thr Pro Lys Glu Cys Gln Tyr
130 135 140

Leu Ile Asp Thr Ile Lys Ala Asp Ala Leu Ile Leu His Leu Asn Val
145 150 155 160

Leu His Glu Leu Thr Gln Pro Glu Gly Asn Lys Asn Trp Glu Asn Leu
165 170 175

Leu Pro Lys Ile Lys Glu Val Ile Asn Tyr Leu Ser Val Pro Val Ile
180 185 190

Val Lys Glu Val Gly Tyr Gly Leu Ser Lys Gln Val Ala Lys Lys Leu
195 200 205

Ile Lys Ala Gly Val Lys Val Leu Asp Ile Ala Gly Ser Gly Gly Thr
210 215 220

Ser Trp Ser Gln Val Glu Ala Tyr Arg Ala Lys Asn Ser Met Gln Asn
225 230 235 240

Arg Ile Ala Ser Ser Phe Ile Asn Trp Gly Ile Thr Thr Leu Asp Ser
245 250 255

Leu Lys Met Leu Gln Glu Ile Ser Lys Asp Ile Thr Ile Ile Ala Ser
260 265 270

Gly Gly Leu Gln Ser Gly Ile Asp Gly Ala Lys Ala Ile Arg Met Gly
275 280 285

Ala Asn Ile Phe Gly Leu Ala Gly Lys Leu Leu Lys Ala Ala Asp Ile
290 295 300

Ala Glu Ser Leu Val Leu Glu Glu Ile Gln Val Ile Ile Glu Gln Leu
305 310 315 320

Lys Ile Thr Met Leu Cys Thr Gly Ser Cys Thr Leu Lys Asp Leu Ala
325 330 335

Lys Ala Glu Ile Met Trp
340

<210> 64 
<211> 286 
<212> PRT 

<213> Deinococcus radiodurans 



<400> 64 

Met Arg Leu Asp Thr Val Phe Leu Gly Arg Arg Leu Lys Ala Pro Val
1 5 10 15

Leu Ile Gly Ala Met Thr Gly Gly Ala Glu Lys Ala Gly Val Ile Asn
20 25 30

Arg Asn Leu Ala Thr Ala Ala Arg Asn Leu Gly Leu Gly Met Met Leu
35 40 45

Gly Ser Gln Arg Val Met Leu Glu His Pro Asp Ala Trp Glu Ser Phe
50 55 60

Asn Val Arg Glu Val Ala Pro Glu Ile Leu Leu Ile Gly Asn Leu Gly
65 70 75 80

Ala Ala Gln Phe Met Leu Gly Tyr Gly Ala Glu Gln Ala Arg Arg Ala
85 90 95

Val Asp Glu Val Met Ala Asp Ala Leu Ala Ile His Leu Asn Pro Leu
100 105 110

Gln Glu Ala Leu Gln Arg Gly Gly Asp Thr Arg Trp Gln Gly Val Thr
115 120 125

Tyr Arg Leu Lys Gln Val Ala Arg Glu Leu Asp Phe Pro Val Ile Ile
130 135 140

Lys Glu Val Gly His Gly Leu Asp Ala Ala Thr Leu Arg Ala Leu Ala
145 150 155 160

Asp Gly Pro Phe Ala Ala Tyr Asp Val Ala Gly Ala Gly Gly Thr Ser
165 170 175

Trp Ala Arg Val Glu Gln Leu Val Ala His Gly Gln Val His Ser Pro
180 185 190

Asp Leu Cys Glu Leu Gly Val Pro Thr Ala Gln Ala Leu Arg Gln Ala
195 200 205

Arg Lys Thr Leu Pro Gly Ala Gln Leu Ile Ala Ser Gly Gly Ile Arg
210 215 220

Ser Gly Leu Asp Ala Ala Arg Ala Leu Ser Leu Gly Ala Glu Val Val
225 230 235 240

Ala Val Ala Arg Pro Leu Leu Glu Pro Ala Leu Asp Ser Ser Glu Ala
245 250 255

Ala Glu Ala Trp Leu Arg Asn Phe Ile Gln Glu Leu Arg Val Ala Leu
260 265 270

Phe Val Gly Gly Tyr Arg Asp Val Arg Glu Val Arg Gly Gly
275 280 285

<210> 65 

<211> 361 

<212> PRT 

<213> Aeropyrum pernix 



<400> 65

Met Ile Val Ser Ser Lys Val Glu Ser Arg Glu Ser Thr Leu Leu Glu
1 5 10 15

Tyr Val Arg Ile Val His Asn Pro Thr Pro Glu Val Asn Leu Gly Asp
20 25 30

Val Ser Leu Glu Ile Asp Phe Cys Gly Gly Arg Leu Arg Ala Pro Leu
35 40 45

Val Ile Thr Gly Met Thr Gly Gly His Pro Asp Val Glu Trp Ile Asn
50 55 60

Arg Glu Leu Ala Ser Val Ala Glu Glu Leu Gly Ile Ala Ile Gly Val
65 70 75 80

Gly Ser Gln Arg Ala Ala Ile Glu Asp Pro Ser Leu Ala Arg Thr Phe
85 90 95

Arg Ala Ala Arg Glu Ala Ala Pro Asn Ala Phe Leu Ile Ala Asn Leu
100 105 110

Gly Ala Pro Gln Leu Ser Leu Gly Tyr Ser Val Arg Glu Val Arg Met
115 120 125

Ala Val Glu Met Ile Asp Ala Asp Ala Ile Ala Ile His Leu Asn Pro
130 135 140

Gly Gln Glu Ala Tyr Gln Pro Glu Gly Asp Pro Phe Tyr Arg Gly Val
145 150 155 160

Val Gly Lys Ile Ala Glu Ala Ala Glu Ala Ala Gly Val Pro Val Ile
165 170 175

Val Lys Glu Thr Gly Asn Gly Leu Ser Arg Glu Ala Val Ala Gln Leu
180 185 190

Arg Ala Leu Gly Val Arg Cys Phe Asp Val Ala Gly Leu Gly Gly Thr
195 200 205

Asn Trp Ile Lys Ile Glu Val Leu Arg Gly Arg Lys Ala Gly Ser Pro
210 215 220

Leu Glu Ala Gly Pro Leu Gln Asp Phe Trp Gly Asn Pro Thr Ala Ala
225 230 235 240

Ala Leu Met Glu Ala Arg Thr Ala Ala Pro Asp Ala Tyr Ile Ile Ala
245 250 255

Ser Gly Gly Val Arg Asn Gly Leu Asp Ala Ala Arg Ala Ile Ala Leu
260 265 270

Gly Ala Asp Ala Ala Gly Val Ala Leu Pro Ala Ile Arg Ser Leu Leu
275 280 285

Ser Gly Gly Arg Gln Ala Thr Leu Lys Leu Leu Lys Ala Ile Glu Tyr
290 295 300

Gln Leu Lys Thr Ala Val Tyr Met Val Gly Glu Thr Arg Val Arg Gly
305 310 315 320

Leu Trp Arg Ala Pro Ile Val Val Trp Gly Arg Leu Ala Glu Glu Ala
325 330 335

Glu Ala Arg Gly Ile Asp Pro Arg Trp Tyr Thr Asn Thr Leu Arg Leu
340 345 350

Glu Ala Leu Val Tyr Lys Asp Val Lys
355 360

<210> 66 
<211> 379 
<212> PRT
<213> Halobacterium sp. NRC-1

<400> 66

Met Gly Glu Ser Arg Tyr Asn Ser Ile Val Phe Pro Ser Leu Val Gln
1 5 10 15

Thr Arg Leu Met Thr Ala Gln Asp Ser Thr Gln Thr Glu Asp Arg Lys
20 25 30

Asp Asp His Leu Gln Ile Val Gln Glu Arg Asp Val Glu Thr Thr Gly
35 40 45

Thr Gly Phe Asp Asp Val His Leu Val His Asn Ala Leu Pro Glu Leu
50 55 60

Asp Tyr Asp Ala Ile Asp Pro Ser Ile Asp Phe Leu Gly His Asp Leu
65 70 75 80

Ser Ala Pro Ile Phe Ile Glu Ser Met Thr Gly Gly His His Asn Thr
85 90 95

Thr Glu Ile Asn Arg Ala Leu Ala Arg Ala Ala Ser Glu Thr Gly Ile
100 105 110

Ala Met Gly Leu Gly Ser Gln Arg Ala Gly Leu Glu Leu Asp Asp Glu
115 120 125

Arg Val Leu Glu Ser Tyr Thr Val Val Arg Asp Ala Ala Pro Asp Ala
130 135 140

Phe Ile Tyr Gly Asn Leu Gly Ala Ala Gln Leu Arg Glu Tyr Asp Ile
145 150 155 160

Glu Met Val Glu Gln Ala Val Glu Met Ile Asp Ala Asp Ala Leu Ala
165 170 175

Val His Leu Asn Phe Leu Gln Glu Ala Thr Gln Pro Glu Gly Asp Val
180 185 190

Asp Gly Arg Asn Cys Val Ala Ala Ile Glu Arg Val Ser Glu Ala Leu
195 200 205

Ser Val Pro Ile Ile Val Lys Glu Thr Gly Asn Gly Ile Ser Gly Glu
210 215 220

Thr Ala Arg Glu Leu Thr Ala Ala Gly Val Asp Ala Leu Asp Val Ala
225 230 235 240

Gly Lys Gly Gly Thr Thr Trp Ser Gly Ile Glu Ala Tyr Arg Ala Ala
245 250 255

Ala Ala Asn Ala Pro Arg Gln Lys Gln Ile Gly Thr Leu Phe Arg Glu
260 265 270

Trp Gly Ile Pro Thr Ala Ala Ser Thr Ile Glu Cys Val Ala Glu His
275 280 285

Asp Cys Val Ile Ala Ser Gly Gly Val Arg Thr Gly Leu Asp Val Ala
290 295 300

Lys Ala Ile Ala Leu Gly Ala Arg Ala Gly Gly Leu Ala Lys Pro Phe
305 310 315 320

Leu Lys Pro Ala Thr Asp Gly Pro Asp Ala Val Ile Glu Arg Val Gly
325 330 335

Asp Leu Ile Ala Glu Leu Arg Thr Ala Met Phe Val Thr Gly Ser Gly
340 345 350

Ser Ile Asp Glu Leu Gln Gln Val Glu Tyr Val Leu His Gly Lys Thr
355 360 365

Arg Glu Tyr Val Glu Gln Arg Thr Ser Ser Glu
370 375

<210> 67 

<211> 317 

<212> PRT 

<213> Archaeoglobus fulgidus 



<400> 67 

Met Met Leu Ile His Lys Ala Leu Pro Glu Val Asp Tyr Trp Lys Ile
1 5 10 15

Asp Thr Glu Ile Glu Phe Phe Gly Lys Lys Leu Ser Phe Pro Leu Leu
20 25 30

Ile Ala Ser Met Thr Gly Gly His Pro Glu Thr Lys Glu Ile Asn Ala
35 40 45

Arg Leu Gly Glu Ala Val Glu Glu Ala Gly Ile Gly Met Gly Val Gly
50 55 60

Ser Gln Arg Ala Ala Ile Glu Asp Glu Ser Leu Ala Asp Ser Phe Thr
65 70 75 80

Val Val Arg Glu Lys Ala Pro Asn Ala Phe Val Tyr Ala Asn Ile Gly
85 90 95

Met Pro Gln Val Ile Glu Arg Gly Val Glu Ile Val Asp Arg Ala Val
100 105 110

Glu Met Ile Asp Ala Asp Ala Val Ala Ile His Leu Asn Tyr Leu Gln
115 120 125

Glu Ala Ile Gln Pro Glu Gly Asp Leu Asn Ala Glu Lys Gly Leu Glu
130 135 140

Val Leu Glu Glu Val Cys Arg Ser Val Lys Val Pro Val Ile Ala Lys
145 150 155 160

Glu Thr Gly Ala Gly Ile Ser Arg Glu Val Ala Val Met Leu Lys Arg
165 170 175

Ala Gly Val Ser Ala Ile Asp Val Gly Gly Lys Gly Gly Thr Thr Phe
180 185 190

Ser Gly Val Glu Val Tyr Arg Val Asn Asp Glu Val Ser Lys Ser Val
195 200 205

Gly Ile Asp Phe Trp Asp Trp Gly Leu Pro Thr Ala Phe Ser Ile Val
210 215 220

Asp Cys Arg Gly Ile Leu Pro Val Ile Ala Thr Gly Gly Leu Arg Ser
225 230 235 240

Gly Leu Asp Val Ala Lys Ser Ile Ala Ile Gly Ala Glu Leu Gly Ser
245 250 255

Ala Ala Leu Pro Phe Leu Arg Ala Ala Val Glu Ser Ala Glu Lys Val
260 265 270

Arg Glu Glu Ile Glu Tyr Phe Arg Arg Gly Leu Lys Thr Ala Met Phe
275 280 285

Leu Thr Gly Cys Lys Asn Val Glu Glu Leu Lys Gly Leu Lys Val Phe
290 295 300

Val Ser Gly Arg Leu Lys Glu Trp Ile Asp Phe Arg Gly
305 310 315

<210> 68 

<211> 370 

<212> PRT 

<213> Pyrococcus abyssi 



<400> 68 

Met Glu Glu Gln Thr Ile Leu Arg Lys Phe Glu His Ile Lys His Cys
1 5 10 15

Leu Thr Lys Asn Val Glu Ala His Val Thr Asn Gly Phe Glu Asp Val
20 25 30

His Leu Ile His Lys Ser Leu Pro Glu Ile Asp Lys Asp Glu Ile Asp
35 40 45

Leu Ser Val Lys Phe Leu Gly Arg Lys Phe Asp Tyr Pro Ile Met Ile
50 55 60

Thr Gly Met Thr Gly Gly Thr Arg Lys Gly Glu Ile Ala Trp Arg Ile
65 70 75 80

Asn Arg Thr Leu Ala Gln Ala Ala Gln Glu Leu Asn Ile Pro Leu Gly
85 90 95

Leu Gly Ser Gln Arg Ala Met Ile Glu Lys Pro Glu Thr Trp Glu Ser
100 105 110

Tyr Tyr Val Arg Asp Val Ala Pro Asp Val Phe Leu Val Gly Asn Leu
115 120 125

Gly Ala Pro Gln Phe Gly Arg Asn Ala Lys Lys Arg Tyr Ser Val Asp
130 135 140

Glu Val Leu Tyr Ala Ile Glu Lys Ile Glu Ala Asp Ala Ile Ala Ile
145 150 155 160

His Met Asn Pro Leu Gln Glu Ser Ile Gln Pro Glu Gly Asp Thr Thr
165 170 175

Phe Ser Gly Val Leu Glu Ala Leu Ala Glu Ile Thr Ser Thr Ile Asp
180 185 190

Tyr Pro Val Ile Ala Lys Glu Thr Gly Ala Gly Val Ser Lys Glu Val
195 200 205

Ala Val Glu Leu Glu Ala Val Gly Val Asp Ala Ile Asp Ile Ser Gly
210 215 220

Leu Gly Gly Thr Ser Trp Ser Ala Val Glu Tyr Tyr Arg Thr Lys Asp
225 230 235 240

Gly Glu Lys Arg Asn Leu Ala Leu Lys Phe Trp Asp Trp Gly Ile Lys
245 250 255

Thr Ala Ile Ser Leu Ala Glu Val Arg Trp Ala Thr Asn Leu Pro Ile
260 265 270

Ile Ala Ser Gly Gly Met Arg Asp Gly Ile Thr Met Ala Lys Ala Leu
275 280 285

Ala Met Gly Ala Ser Met Val Gly Ile Ala Leu Pro Val Leu Arg Pro
290 295 300

Ala Ala Lys Gly Asp Val Glu Gly Val Ile Arg Ile Ile Lys Gly Tyr
305 310 315 320

Ala Glu Glu Ile Arg Asn Val Met Phe Leu Val Gly Ala Arg Asn Ile
325 330 335

Lys Glu Leu Arg Lys Val Pro Leu Val Ile Thr Gly Phe Val Arg Glu
340 345 350

Trp Leu Leu Gln Arg Ile Asp Leu Asn Ser Tyr Leu Arg Ala Arg Phe
355 360 365

Lys Met
370

<210> 69
<211> 371
<212> PRT
<213> Pyrococcus horikoshii

<400> 69

Met Lys Glu Glu Leu Thr Ile Leu Arg Lys Phe Glu His Ile Glu His
1 5 10 15

Cys Leu Lys Arg Asn Val Glu Ala His Val Ser Asn Gly Phe Glu Asp
20 25 30

Val Tyr Phe Val His Lys Ser Leu Pro Glu Ile Asp Lys Asp Glu Ile
35 40 45

Asp Leu Thr Val Glu Phe Leu Gly Arg Lys Phe Asp Tyr Pro Ile Met
50 55 60

Ile Thr Gly Met Thr Gly Gly Thr Arg Arg Glu Glu Ile Ala Gly Lys
65 70 75 80

Ile Asn Arg Thr Leu Ala Met Ala Ala Glu Glu Leu Asn Ile Pro Phe
85 90 95

Gly Val Gly Ser Gln Arg Ala Met Ile Glu Lys Pro Glu Thr Trp Glu
100 105 110

Ser Tyr Tyr Val Arg Asp Val Ala Pro Asp Ile Phe Leu Ile Gly Asn
115 120 125

Leu Gly Ala Pro Gln Phe Gly Lys Asn Ala Lys Lys Arg Tyr Ser Val
130 135 140

Lys Glu Val Leu Tyr Ala Ile Glu Lys Ile Glu Ala Asp Ala Ile Ala
145 150 155 160

Ile His Met Asn Pro Leu Gln Glu Ser Val Gln Pro Glu Gly Asp Thr
165 170 175

Thr Tyr Ala Gly Val Leu Glu Ala Leu Ala Glu Ile Lys Ser Ser Ile
180 185 190

Asn Tyr Pro Val Ile Ala Lys Glu Thr Gly Ala Gly Val Ser Lys Glu
195 200 205

Val Ala Ile Glu Leu Glu Ser Val Gly Ile Asp Ala Ile Asp Ile Ser
210 215 220

Gly Leu Gly Gly Thr Ser Trp Ser Ala Val Glu Tyr Tyr Arg Ala Lys
225 230 235 240

Asp Ser Glu Lys Arg Lys Ile Ala Leu Lys Phe Trp Asp Trp Gly Ile
245 250 255

Lys Thr Ala Ile Ser Leu Ala Glu Val Arg Trp Ala Thr Asn Leu Pro
260 265 270

Ile Ile Ala Ser Gly Gly Met Arg Asp Gly Val Met Met Ala Lys Ala
275 280 285

Leu Ala Met Gly Ala Ser Leu Val Gly Ile Ala Leu Pro Val Leu Arg
290 295 300

Pro Ala Ala Arg Gly Asp Val Glu Gly Val Val Arg Ile Ile Arg Gly
305 310 315 320

Tyr Ala Glu Glu Ile Lys Asn Val Met Phe Leu Val Gly Ala Arg Asn
325 330 335

Ile Arg Glu Leu Arg Arg Val Pro Leu Val Ile Thr Gly Phe Val Arg
340 345 350

Glu Trp Leu Leu Gln Arg Ile Asp Leu Asn Ser Tyr Leu Arg Ser Arg
355 360 365

Phe Lys His
370

<210> 70
<211> 349
<212> PRT
<213> Methanobacterium thermoautotrophicum

<400> 70

Met Ile Ser Asp Arg Lys Leu Glu His Leu Ile Leu Cys Ala Ser Cys
1 5 10 15

Asp Val Glu Tyr Arg Lys Lys Thr Gly Phe Glu Asp Ile Glu Ile Val
20 25 30

His Arg Ala Ile Pro Glu Ile Asn Lys Glu Lys Ile Asp Ile Ser Leu
35 40 45

Asp Phe Leu Gly Arg Glu Leu Ser Ser Pro Val Met Ile Ser Ala Ile
50 55 60

Thr Gly Gly His Pro Ala Ser Met Lys Ile Asn Arg Glu Leu Ala Arg
65 70 75 80

Ala Ala Glu Lys Leu Gly Ile Ala Leu Gly Leu Gly Ser Gln Arg Ala
85 90 95

Gly Val Glu His Pro Glu Leu Glu Gly Thr Tyr Thr Ile Ala Arg Glu
100 105 110

Glu Ala Pro Ser Ala Met Leu Ile Gly Asn Ile Gly Ser Ser His Ile
115 120 125

Glu Tyr Ala Glu Arg Ala Val Glu Met Ile Asp Ala Asp Ala Leu Ala
130 135 140

Val His Leu Asn Pro Leu Gln Glu Ser Ile Gln Pro Gly Gly Asp Val
145 150 155 160

Asp Ser Ser Gly Ala Leu Glu Ser Ile Ser Ala Ile Val Glu Ser Val
165 170 175

Asp Val Pro Val Met Val Lys Glu Thr Gly Ala Gly Ile Cys Ser Glu
180 185 190

Asp Ala Ile Glu Leu Glu Ser Cys Gly Val Ser Ala Ile Asp Val Ala
195 200 205

Gly Ala Gly Gly Thr Ser Trp Ala Ala Val Glu Thr Tyr Arg Ala Asp
210 215 220

Asp Arg Tyr Leu Gly Glu Leu Phe Trp Asp Trp Gly Ile Pro Thr Ala
225 230 235 240

Ala Ser Thr Val Glu Val Val Glu Ser Val Ser Ile Pro Val Ile Ala
245 250 255

Ser Gly Gly Ile Arg Ser Gly Ile Asp Ala Ala Lys Ala Ile Ser Leu
260 265 270

Gly Ala Glu Met Val Gly Ile Ala Leu Pro Val Leu Glu Ala Ala Gly
275 280 285

His Gly Tyr Arg Glu Val Ile Lys Val Ile Glu Gly Phe Asn Glu Ala
290 295 300

Leu Arg Thr Ala Met Tyr Leu Ala Gly Ala Glu Thr Leu Asp Asp Leu
305 310 315 320

Lys Lys Ser Pro Val Ile Ile Thr Gly His Thr Gly Glu Trp Leu Asn
325 330 335

Gln Arg Gly Phe Glu Thr Lys Lys Tyr Ala Arg Arg Ser
340 345

<210> 71
<211> 359
<212> PRT
<213> Methanococcus jannaschii

<400> 71

Met Val Asn Asn Arg Asn Glu Ile Glu Val Arg Lys Leu Glu His Ile
1 5 10 15

Phe Leu Cys Ser Tyr Cys Asn Val Glu Tyr Glu Lys Thr Thr Leu Leu
20 25 30

Glu Asp Ile Glu Leu Ile His Lys Gly Thr Cys Gly Ile Asn Phe Asn
35 40 45

Asp Ile Glu Thr Glu Ile Glu Leu Phe Gly Lys Lys Leu Ser Ala Pro
50 55 60

Ile Ile Val Ser Gly Met Thr Gly Gly His Ser Lys Ala Lys Glu Ile
65 70 75 80

Asn Lys Asn Ile Ala Lys Ala Val Glu Glu Leu Gly Leu Gly Met Gly
85 90 95

Val Gly Ser Gln Arg Ala Ala Ile Val Asn Asp Glu Leu Ile Asp Thr
100 105 110

Tyr Ser Ile Val Arg Asp Tyr Thr Asn Asn Leu Val Ile Gly Asn Leu
115 120 125

Gly Ala Val Asn Phe Ile Val Asp Asp Trp Asp Glu Glu Ile Ile Asp
130 135 140

Lys Ala Ile Glu Met Ile Asp Ala Asp Ala Ile Ala Ile His Phe Asn
145 150 155 160

Pro Leu Gln Glu Ile Ile Gln Pro Glu Gly Asp Leu Asn Phe Lys Asn
165 170 175

Leu Tyr Lys Leu Lys Glu Ile Ile Ser Asn Tyr Lys Lys Ser Tyr Lys
180 185 190

Asn Ile Pro Phe Ile Ala Lys Gln Val Gly Glu Gly Phe Ser Lys Glu
195 200 205

Asp Ala Leu Ile Leu Lys Asp Ile Gly Phe Asp Ala Ile Asp Val Gln
210 215 220

Gly Ser Gly Gly Thr Ser Trp Ala Lys Val Glu Ile Tyr Arg Val Lys
225 230 235 240

Glu Glu Glu Ile Lys Arg Leu Ala Glu Lys Phe Ala Asn Trp Gly Ile
245 250 255

Pro Thr Ala Ala Ser Ile Phe Glu Val Lys Ser Val Tyr Asp Gly Ile
260 265 270

Val Ile Gly Ser Gly Gly Ile Arg Gly Gly Leu Asp Ile Ala Lys Cys
275 280 285

Ile Ala Ile Gly Cys Asp Cys Cys Ser Val Ala Leu Pro Ile Leu Lys
290 295 300

Ala Ser Leu Lys Gly Trp Glu Glu Val Val Lys Val Leu Glu Ser Tyr
305 310 315 320

Ile Lys Glu Leu Lys Ile Ala Met Phe Leu Val Gly Ala Glu Asn Ile
325 330 335

Glu Glu Leu Lys Lys Thr Ser Tyr Ile Val Lys Gly Thr Leu Lys Glu
340 345 350

Trp Ile Ser Gln Arg Leu Lys
355

<210> 72
<211> 348
<212> PRT
<213> Thermoplasma acidophilum

<400> 72

Met Ile Gly Lys Arg Lys Glu Glu His Ile Arg Ile Ala Glu Asn Glu
1 5 10 15

Asp Val Ser Ser Phe His Asn Phe Trp Asp Asp Ile Ser Leu Met His
20 25 30

Glu Ala Asp Pro Glu Val Asn Tyr Asp Glu Ile Asp Thr Ser Val Asp
35 40 45

Phe Leu Gly Lys Lys Leu Lys Phe Pro Met Ile Ile Ser Ser Met Thr
50 55 60

Gly Gly Ala Glu Ile Ala Lys Asn Ile Asn Arg Asn Leu Ala Val Ala
65 70 75 80

Ala Glu Arg Phe Gly Ile Gly Met Gly Val Gly Ser Met Arg Ala Ala
85 90 95

Ile Val Asp Arg Ser Ile Glu Asp Thr Tyr Ser Val Ile Asn Glu Ser
100 105 110

His Val Pro Leu Lys Ile Ala Asn Ile Gly Ala Pro Gln Leu Val Arg
115 120 125

Gln Asp Lys Asp Ala Val Ser Asn Arg Asp Ile Ala Tyr Ile Tyr Asp
130 135 140

Leu Ile Lys Ala Asp Phe Leu Ala Val His Phe Asn Phe Leu Gln Glu
145 150 155 160

Met Val Gln Pro Glu Gly Asp Arg Asn Ser Lys Gly Val Ile Asp Arg
165 170 175

Ile Lys Asp Leu Ser Gly Ser Phe Asn Ile Ile Ala Lys Glu Thr Gly
180 185 190

Ser Gly Phe Ser Arg Arg Thr Ala Glu Arg Leu Ile Asp Ala Gly Val
195 200 205

Lys Ala Ile Glu Val Ser Gly Val Ser Gly Thr Thr Phe Ala Ala Val
210 215 220

Glu Tyr Tyr Arg Ala Arg Lys Glu Asn Asn Leu Glu Lys Met Arg Ile
225 230 235 240

Gly Glu Thr Phe Trp Asn Trp Gly Ile Pro Ser Pro Ala Ser Val Tyr
245 250 255

Tyr Cys Ser Asp Leu Ala Pro Val Ile Gly Ser Gly Gly Leu Arg Asn
260 265 270

Gly Leu Asp Leu Ala Lys Ala Ile Ala Met Gly Ala Thr Ala Gly Gly
275 280 285

Phe Ala Arg Ser Leu Leu Lys Asp Ala Asp Thr Asp Pro Glu Met Leu
290 295 300

Met Lys Asn Ile Glu Leu Ile Gln Arg Glu Phe Arg Val Ala Leu Phe
305 310 315 320

Leu Thr Gly Asn Lys Asn Val Tyr Glu Leu Lys Phe Thr Lys Lys Val
325 330 335

Ile Val Asp Pro Leu Arg Ser Trp Leu Glu Ala Lys
340 345

<210> 73
<211> 357
<212> PRT
<213> Leishmania major

<400> 73

Met Ser Ser Arg Asp Cys Thr Val Asp Arg Glu Ala Ala Val Gln Lys
1 5 10 15

Arg Lys Lys Asp His Ile Asp Ile Cys Leu His Gln Asp Val Glu Pro
20 25 30

His Lys Arg Arg Thr Ser Ile Trp Asn Lys Tyr Thr Leu Pro Tyr Lys
35 40 45

Ala Leu Pro Glu Val Asp Leu Gln Lys Ile Asp Thr Ser Cys Glu Phe
50 55 60

Met Gly Lys Arg Ile Ser Phe Pro Phe Phe Ile Ser Ser Met Thr Gly
65 70 75 80

Gly Glu Ala His Gly Arg Val Ile Asn Glu Asn Leu Ala Lys Ala Cys
85 90 95

Glu Ala Glu Lys Ile Pro Phe Gly Leu Gly Ser Met Arg Ile Ile Asn
100 105 110

Arg Tyr Ala Ser Ala Val His Thr Phe Asn Val Lys Glu Phe Cys Pro
115 120 125

Ser Val Pro Met Leu Ala Asn Ile Gly Leu Val Gln Leu Asn Tyr Gly
130 135 140

Phe Gly Pro Lys Glu Val Asn Asn Leu Val Asn Ser Val Arg Ala Asp
145 150 155 160

Gly Leu Cys Ile His Leu Asn His Thr Gln Glu Val Cys Gln Pro Glu
165 170 175

Gly Asp Thr Asn Phe Glu Gly Leu Ile Glu Lys Leu Arg Gln Leu Leu
180 185 190

Pro His Ile Lys Val Pro Val Leu Val Lys Gly Val Gly His Gly Ile
195 200 205

Asp Tyr Glu Ser Met Val Ala Ile Lys Ala Ser Gly Val Lys Tyr Val
210 215 220

Asp Val Ser Gly Cys Gly Gly Thr Ser Trp Ala Trp Ile Glu Gly Arg
225 230 235 240

Arg Gln Pro Tyr Lys Ala Glu Glu Glu Asn Ile Gly Tyr Leu Leu Arg
245 250 255

Asp Ile Gly Val Pro Thr Asp Val Cys Leu Arg Glu Ser Ala Pro Leu
260 265 270

Thr Val Asn Gly Asp Leu His Leu Ile Ala Gly Gly Gly Ile Arg Asn
275 280 285

Gly Met Asp Val Ala Lys Ala Leu Met Met Gly Ala Glu Tyr Ala Thr
290 295 300

Ala Ala Met Pro Phe Leu Ala Ala Ala Leu Glu Ser Ser Glu Ala Val
305 310 315 320

Arg Ala Val Ile Gln Arg Met Arg Gln Glu Leu Arg Val Ser Met Phe
325 330 335

Thr Cys Gly Ala Arg Asn Ile Glu Glu Leu Arg Arg Met Lys Val Ile
340 345 350

Glu Leu Gly His Leu
355

<210> 74
<211> 398
<212> PRT
<213> Streptococcus pneumoniae

<400> 74

Met Asn Asp Lys Thr Glu Val Asn Met Thr Ile Gly Ile Asp Lys Ile
1 5 10 15

Gly Phe Ala Thr Ser Gln Tyr Val Leu Lys Leu Gln Asp Leu Ala Glu
20 25 30

Ala Arg Gly Ile Asp Pro Glu Lys Leu Ser Lys Gly Leu Leu Leu Lys
35 40 45

Glu Leu Ser Ile Ala Pro Leu Thr Glu Asp Ile Val Thr Leu Ala Ala
50 55 60

Ser Ala Ser Asp Ser Ile Leu Thr Glu Gln Glu Arg Gln Glu Val Asp
65 70 75 80

Met Val Ile Val Ala Thr Glu Ser Gly Ile Asp Gln Ser Lys Ala Ala
85 90 95

Ala Val Phe Val His Gly Leu Leu Gly Ile Gln Pro Phe Ala Arg Ser
100 105 110

Phe Glu Ile Lys Glu Ala Cys Tyr Gly Ala Thr Ala Ala Leu His Tyr
115 120 125

Ala Lys Leu His Val Glu Asn Ser Pro Glu Ser Lys Val Leu Val Ile
130 135 140

Ala Ser Asp Ile Ala Lys Tyr Gly Ile Glu Thr Pro Gly Glu Pro Thr
145 150 155 160

Gln Gly Ala Gly Ser Val Ala Met Leu Ile Thr Gln Asn Pro Arg Met
165 170 175

Met Ala Phe Asn Asn Asp Asn Val Ala Gln Thr Arg Asp Ile Met Asp
180 185 190

Phe Trp Arg Pro Asn Tyr Ser Thr Thr Pro Tyr Val Asn Gly Val Tyr
195 200 205

Ser Thr Gln Gln Tyr Leu Asp Ser Leu Lys Thr Thr Trp Leu Glu Tyr
210 215 220

Gln Lys Arg Tyr Gln Leu Thr Leu Asp Asp Phe Ala Ala Val Cys Phe
225 230 235 240

His Leu Pro Tyr Pro Lys Leu Ala Leu Lys Gly Leu Lys Lys Ile Met
245 250 255

Asp Lys Asn Leu Pro Gln Glu Lys Lys Asp Leu Leu Gln Lys His Phe
260 265 270

Asp Gln Ser Ile Leu Tyr Ser Gln Lys Val Gly Asn Ile Tyr Thr Gly
275 280 285

Ser Leu Phe Leu Gly Leu Leu Ser Leu Leu Glu Asn Thr Asp Ser Leu
290 295 300

Lys Ala Gly Asp Lys Ile Ala Leu Tyr Ser Tyr Gly Ser Gly Ala Val
305 310 315 320

Ala Glu Phe Phe Ser Gly Glu Leu Val Glu Gly Tyr Glu Ala Tyr Leu
325 330 335

Asp Lys Asp Arg Leu Asn Lys Leu Asn Gln Arg Thr Ala Leu Ser Val
340 345 350

Ala Asp Tyr Glu Lys Val Phe Phe Glu Glu Val Asn Leu Asp Glu Thr
355 360 365

Asn Ser Ala Gln Phe Ala Gly Tyr Glu Asn Gln Asp Phe Ala Leu Val
370 375 380

Glu Ile Leu Asp His Gln Arg Arg Tyr Ser Lys Val Glu Lys
385 390 395

<210> 75
<211> 391
<212> PRT
<213> Streptococcus pyogenes

<400> 75

Met Thr Ile Gly Ile Asp Lys Ile Gly Phe Ala Thr Ser Gln Tyr Val
1 5 10 15

Leu Lys Leu Glu Asp Leu Ala Leu Ala Arg Gln Val Asp Pro Ala Lys
20 25 30

Phe Ser Gln Gly Leu Leu Ile Glu Ser Phe Ser Val Ala Pro Ile Thr
35 40 45

Glu Asp Ile Ile Thr Leu Ala Ala Ser Ala Ala Asp Gln Ile Leu Thr
50 55 60

Asp Glu Asp Arg Ala Lys Ile Asp Met Val Ile Leu Ala Thr Glu Ser
65 70 75 80

Ser Thr Asp Gln Ser Lys Ala Ser Ala Ile Tyr Val His His Leu Val
85 90 95

Gly Ile Gln Pro Phe Ala Arg Ser Phe Glu Val Lys Gln Ala Cys Tyr
100 105 110

Ser Ala Thr Ala Ala Leu Asp Tyr Ala Lys Leu His Val Ala Ser Lys
115 120 125

Pro Asp Ser Arg Val Leu Val Ile Ala Ser Asp Ile Ala Arg Tyr Gly
130 135 140

Val Gly Ser Pro Gly Glu Ser Thr Gln Gly Ser Gly Ser Ile Ala Leu
145 150 155 160

Leu Val Thr Ala Asp Pro Arg Ile Leu Ala Leu Asn Glu Asp Asn Val
165 170 175

Ala Gln Thr Arg Asp Ile Met Asp Phe Trp Arg Pro Asn Tyr Ser Phe
180 185 190

Thr Pro Tyr Val Asp Gly Ile Tyr Ser Thr Lys Gln Tyr Leu Asn Cys
195 200 205

Leu Glu Thr Thr Trp Gln Ala Tyr Gln Lys Arg Glu Asn Leu Gln Leu
210 215 220

Ser Asp Phe Ala Ala Val Cys Phe His Ile Pro Phe Pro Lys Leu Ala
225 230 235 240

Leu Lys Gly Leu Asn Asn Ile Met Asp Asn Thr Val Pro Pro Glu His
245 250 255

Arg Glu Lys Leu Ile Glu Ala Phe Gln Ala Ser Ile Thr Tyr Ser Lys
260 265 270

Gln Ile Gly Asn Ile Tyr Thr Gly Ser Leu Tyr Leu Gly Leu Leu Ser
275 280 285

Leu Leu Glu Asn Ser Lys Val Leu Gln Ser Gly Asp Lys Ile Gly Phe
290 295 300

Phe Ser Tyr Gly Ser Gly Ala Val Ser Glu Phe Tyr Ser Gly Gln Leu
305 310 315 320

Val Ala Gly Tyr Asp Lys Met Leu Met Thr Asn Arg Gln Ala Leu Leu
325 330 335

WO 02/099095 PCT/EP02/06171

Asp Gln Arg Thr Arg Leu Ser Val Ser Lys Tyr Glu Asp Leu Phe Tyr
340 345 350

Glu Gln Val Gln Leu Asp Asp Asn Gly Asn Ala Asn Phe Asp Ile Tyr
355 360 365

Leu Thr Gly Lys Phe Ala Leu Thr Ala Ile Lys Glu His Gln Arg Ile
370 375 380

Tyr His Thr Asn Asp Lys Asn
385 390


<210> 76
<211> 383
<212> PRT
<213> Enterococcus faecalis

<400> 76

Met Thr Ile Gly Ile Asp Lys Ile Ser Phe Phe Val Pro Pro Tyr Tyr
1 5 10 15

Ile Asp Met Thr Ala Leu Ala Glu Ala Arg Asn Val Asp Pro Gly Lys
20 25 30

Phe His Ile Gly Ile Gly Gln Asp Gln Met Ala Val Asn Pro Ile Ser
35 40 45

Gln Asp Ile Val Thr Phe Ala Ala Asn Ala Ala Glu Ala Ile Leu Thr
50 55 60

Lys Glu Asp Lys Glu Ala Ile Asp Met Val Ile Val Gly Thr Glu Ser
65 70 75 80

Ser Ile Asp Glu Ser Lys Ala Ala Ala Val Val Leu His Arg Leu Met
85 90 95

Gly Ile Gln Pro Phe Ala Arg Ser Phe Glu Ile Lys Glu Ala Cys Tyr
100 105 110

Gly Ala Thr Ala Gly Leu Gln Leu Ala Lys Asn His Val Ala Leu His
115 120 125

Pro Asp Lys Lys Val Leu Val Val Ala Ala Asp Ile Ala Lys Tyr Gly
130 135 140

Leu Asn Ser Gly Gly Glu Pro Thr Gln Gly Ala Gly Ala Val Ala Met
145 150 155 160

Leu Val Ala Ser Glu Pro Arg Ile Leu Ala Leu Lys Glu Asp Asn Val
165 170 175

Met Leu Thr Gln Asp Ile Tyr Asp Phe Trp Arg Pro Thr Gly His Pro
180 185 190

Tyr Pro Met Val Asp Gly Pro Leu Ser Asn Glu Thr Tyr Ile Gln Ser
195 200 205

Phe Ala Gln Val Trp Asp Glu His Lys Lys Arg Thr Gly Leu Asp Phe
210 215 220

Ala Asp Tyr Asp Ala Leu Ala Phe His Ile Pro Tyr Thr Lys Met Gly
225 230 235 240

Lys Lys Ala Leu Leu Ala Lys Ile Ser Asp Gln Thr Glu Ala Glu Gln
245 250 255

Glu Arg Ile Leu Ala Arg Tyr Glu Glu Ser Ile Ile Tyr Ser Arg Arg
260 265 270

Val Gly Asn Leu Tyr Thr Gly Ser Leu Tyr Leu Gly Leu Ile Ser Leu
275 280 285

Leu Glu Asn Ala Thr Thr Leu Thr Ala Gly Asn Gln Ile Gly Leu Phe
290 295 300

Ser Tyr Gly Ser Gly Ala Val Ala Glu Phe Phe Thr Gly Glu Leu Val
305 310 315 320

Ala Gly Tyr Gln Asn His Leu Gln Lys Glu Thr His Leu Ala Leu Leu
325 330 335

Asp Asn Arg Thr Glu Leu Ser Ile Ala Glu Tyr Glu Ala Met Phe Ala
340 345 350

Glu Thr Leu Asp Thr Asp Ile Asp Gln Thr Leu Glu Asp Glu Leu Lys
355 360 365

Tyr Ser Ile Ser Ala Ile Asn Asn Thr Val Arg Ser Tyr Arg Asn
370 375 380



<210> 77
<211> 384
<212> PRT
<213> Enterococcus faecium

<400> 77

Met Lys Ile Gly Ile Asp Arg Leu Ser Phe Phe Ile Pro Asn Leu Tyr
1 5 10 15

Leu Asp Met Thr Glu Leu Ala Glu Ser Arg Gly Asp Asp Pro Ala Lys
20 25 30

Tyr His Ile Gly Ile Gly Gln Asp Gln Met Ala Val Asn Arg Ala Asn
35 40 45

Glu Asp Ile Ile Thr Leu Gly Ala Asn Ala Ala Ser Lys Ile Val Thr
50 55 60

Glu Lys Asp Arg Glu Leu Ile Asp Met Val Ile Val Gly Thr Glu Ser
65 70 75 80

Gly Ile Asp His Ser Lys Ala Ser Ala Val Ile Ile His His Leu Leu
85 90 95

Lys Ile Gln Ser Phe Ala Arg Ser Phe Glu Val Lys Glu Ala Cys Tyr
100 105 110

Gly Gly Thr Ala Ala Leu His Met Ala Lys Glu Tyr Val Lys Asn His
115 120 125

Pro Glu Arg Lys Val Leu Val Ile Ala Ser Asp Ile Ala Arg Tyr Gly
130 135 140

Leu Ala Ser Gly Gly Glu Val Thr Gln Gly Val Gly Ala Val Ala Met
145 150 155 160

Met Ile Thr Gln Asn Pro Arg Ile Leu Ser Ile Glu Asp Asp Ser Val
165 170 175

Phe Leu Thr Glu Asp Ile Tyr Asp Phe Trp Arg Pro Asp Tyr Ser Glu
180 185 190

Phe Pro Val Val Asp Gly Pro Leu Ser Asn Ser Thr Tyr Ile Glu Ser
195 200 205

Phe Gln Lys Val Trp Asn Arg His Lys Glu Leu Ser Gly Arg Gly Leu
210 215 220

Glu Asp Tyr Gln Ala Ile Ala Phe His Ile Pro Tyr Thr Lys Met Gly
225 230 235 240

Lys Lys Ala Leu Gln Ser Val Leu Asp Gln Thr Asp Glu Asp Asn Gln
245 250 255

Glu Arg Leu Met Ala Arg Tyr Glu Glu Ser Ile Arg Tyr Ser Arg Arg
260 265 270

Ile Gly Asn Leu Tyr Thr Gly Ser Leu Tyr Leu Gly Leu Thr Ser Leu
275 280 285

Leu Glu Asn Ser Lys Ser Leu Gln Pro Gly Asp Arg Ile Gly Leu Phe
290 295 300

Ser Tyr Gly Ser Gly Ala Val Ser Glu Phe Phe Thr Gly Tyr Leu Glu
305 310 315 320

Glu Asn Tyr Gln Glu Tyr Leu Phe Ala Gln Ser His Gln Glu Met Leu
325 330 335

Asp Ser Arg Thr Arg Ile Thr Val Asp Glu Tyr Glu Thr Ile Phe Ser
340 345 350

Glu Thr Leu Pro Glu His Gly Glu Cys Ala Glu Tyr Thr Ser Asp Val
355 360 365

Pro Phe Ser Ile Thr Lys Ile Glu Asn Asp Ile Arg Tyr Tyr Lys Ile
370 375 380

<210> 78
<211> 388
<212> PRT
<213> Staphylococcus haemolyticus

<400> 78

Met Ser Ile Gly Ile Asp Lys Ile Asn Phe Tyr Val Pro Lys Tyr Tyr
1 5 10 15

Val Asp Met Ala Lys Leu Ala Glu Ala Arg Gln Val Asp Pro Asn Lys
20 25 30

Phe Leu Ile Gly Ile Gly Gln Thr Gln Met Ala Val Ser Pro Val Ser
35 40 45

Gln Asp Ile Val Ser Met Gly Ala Asn Ala Ala Lys Asp Ile Ile Thr
50 55 60

Asp Asp Asp Lys Lys His Ile Gly Met Val Ile Val Ala Thr Glu Ser
65 70 75 80

Ala Ile Asp Asn Ala Lys Ala Ala Ala Val Gln Ile His Asn Leu Leu
85 90 95

Gly Val Gln Pro Phe Ala Arg Cys Phe Glu Met Lys Glu Ala Cys Tyr
100 105 110

Ala Ala Thr Pro Ala Ile Gln Leu Ala Lys Asp Tyr Ile Glu Lys Arg
115 120 125

Pro Asn Glu Lys Val Leu Val Ile Ala Ser Asp Thr Ala Arg Tyr Gly
130 135 140

Ile Gln Ser Gly Gly Glu Pro Thr Gln Gly Ala Gly Ala Val Ala Met
145 150 155 160

Leu Ile Ser Asn Asn Pro Ser Ile Leu Glu Leu Asn Asp Asp Ala Val
165 170 175

Ala Tyr Thr Glu Asp Val Tyr Asp Phe Trp Arg Pro Thr Gly His Lys
180 185 190

Tyr Pro Leu Val Ala Gly Ala Leu Ser Lys Asp Ala Tyr Ile Lys Ser
195 200 205

Phe Gln Glu Ser Trp Asn Glu Tyr Ala Arg Arg Glu Asp Lys Thr Leu
210 215 220

Ser Asp Phe Glu Ser Leu Cys Phe His Val Pro Phe Thr Lys Met Gly
225 230 235 240

Lys Lys Ala Leu Asp Ser Ile Ile Asn Asp Ala Asp Glu Thr Thr Gln
245 250 255

Glu Arg Leu Thr Ser Gly Tyr Glu Asp Ala Val Tyr Tyr Asn Arg Tyr
260 265 270

Val Gly Asn Ile Tyr Thr Gly Ser Leu Tyr Leu Ser Leu Ile Ser Leu
275 280 285

Leu Glu Asn Arg Ser Leu Lys Gly Gly Gln Thr Ile Gly Leu Phe Ser
290 295 300

Tyr Gly Ser Gly Ser Val Gly Glu Phe Phe Ser Ala Thr Leu Val Glu
305 310 315 320

Gly Tyr Glu Lys Gln Leu Asp Ile Glu Gly His Lys Ala Leu Leu Asn
325 330 335

Glu Arg Gln Glu Val Ser Val Glu Asp Tyr Glu Ser Phe Phe Lys Arg
340 345 350

Phe Asp Asp Leu Glu Phe Asp His Ala Thr Glu Gln Thr Asp Asp Asp
355 360 365

Lys Ser Ile Tyr Tyr Leu Glu Asn Ile Gln Asp Asp Ile Arg Gln Tyr
370 375 380

His Ile Pro Lys
385

<210> 79
<211> 388
<212> PRT
<213> Staphylococcus epidermidis

<400> 79

Met Asn Ile Gly Ile Asp Lys Ile Ser Phe Tyr Val Pro Lys Tyr Tyr
1 5 10 15

Val Asp Met Ala Lys Leu Ala Glu Ala Arg Gln Val Asp Pro Asn Lys
20 25 30

Phe Leu Ile Gly Ile Gly Gln Thr Glu Met Thr Val Ser Pro Val Asn
35 40 45

Gln Asp Ile Val Ser Met Gly Ala Asn Ala Ala Lys Asp Ile Ile Thr
50 55 60

Glu Glu Asp Lys Lys Asn Ile Gly Met Val Ile Val Ala Thr Glu Ser
65 70 75 80

Ala Ile Asp Asn Ala Lys Ala Ala Ala Val Gln Ile His His Leu Leu
85 90 95

Gly Ile Gln Pro Phe Ala Arg Cys Phe Glu Met Lys Glu Ala Cys Tyr
100 105 110

Ala Ala Thr Pro Ala Ile Gln Leu Ala Lys Asp Tyr Leu Ala Gln Arg
115 120 125

Pro Asn Glu Lys Val Leu Val Ile Ala Ser Asp Thr Ala Arg Tyr Gly
130 135 140

Ile His Ser Gly Gly Glu Pro Thr Gln Gly Ala Gly Ala Val Ala Met
145 150 155 160

Met Ile Ser His Asp Pro Ser Ile Leu Lys Leu Asn Asp Asp Ala Val
165 170 175

Ala Tyr Thr Glu Asp Val Tyr Asp Phe Trp Arg Pro Thr Gly His Gln
180 185 190

Tyr Pro Leu Val Ala Gly Ala Leu Ser Lys Asp Ala Tyr Ile Lys Ser
195 200 205

Phe Gln Glu Ser Trp Asn Glu Tyr Ala Arg Arg His Asn Lys Thr Leu
210 215 220

Ala Asp Phe Ala Ser Leu Cys Phe His Val Pro Phe Thr Lys Met Gly
225 230 235 240

Gln Lys Ala Leu Asp Ser Ile Ile Asn His Ala Asp Glu Thr Thr Gln
245 250 255

Asp Arg Leu Asn Ser Ser Tyr Gln Asp Ala Val Asp Tyr Asn Arg Tyr
260 265 270

Val Gly Asn Ile Tyr Thr Gly Ser Leu Tyr Leu Ser Leu Ile Ser Leu
275 280 285

Leu Glu Thr Arg Asp Leu Lys Gly Gly Gln Thr Ile Gly Leu Phe Ser
290 295 300

Tyr Gly Ser Gly Ser Val Gly Glu Phe Phe Ser Gly Thr Leu Val Asp
305 310 315 320

Gly Phe Lys Glu Gln Leu Asp Val Glu Arg His Lys Ser Leu Leu Asn
325 330 335

Asn Arg Ile Glu Val Ser Val Asp Glu Tyr Glu His Phe Phe Lys Arg
340 345 350

Phe Asp Gln Leu Glu Leu Asn His Glu Leu Glu Lys Ser Asn Ala Asp
355 360 365

Arg Asp Ile Phe Tyr Leu Lys Ser Ile Asp Asn Asn Ile Arg Glu Tyr
370 375 380

His Ile Ala Glu
385

<210> 80
<211> 388
<212> PRT
<213> Staphylococcus aureus

<400> 80

Met Thr Ile Gly Ile Asp Lys Ile Asn Phe Tyr Val Pro Lys Tyr Tyr
1 5 10 15

Val Asp Met Ala Lys Leu Ala Glu Ala Arg Gln Val Asp Pro Asn Lys
20 25 30

Phe Leu Ile Gly Ile Gly Gln Thr Glu Met Ala Val Ser Pro Val Asn
35 40 45

Gln Asp Ile Val Ser Met Gly Ala Asn Ala Ala Lys Asp Ile Ile Thr
50 55 60

Asp Glu Asp Lys Lys Lys Ile Gly Met Val Ile Val Ala Thr Glu Ser
65 70 75 80

Ala Val Asp Ala Ala Lys Ala Ala Ala Val Gln Ile His Asn Leu Leu
85 90 95

Gly Ile Gln Pro Phe Ala Arg Cys Phe Glu Met Lys Glu Ala Cys Tyr
100 105 110

Ala Ala Thr Pro Ala Ile Gln Leu Ala Lys Asp Tyr Leu Ala Thr Arg
115 120 125

Pro Asn Glu Lys Val Leu Val Ile Ala Thr Asp Thr Ala Arg Tyr Gly
130 135 140

Leu Asn Ser Gly Gly Glu Pro Thr Gln Gly Ala Gly Ala Val Ala Met
145 150 155 160

Val Ile Ala His Asn Pro Ser Ile Leu Ala Leu Asn Glu Asp Ala Val
165 170 175

Ala Tyr Thr Glu Asp Val Tyr Asp Phe Trp Arg Pro Thr Gly His Lys
180 185 190

Tyr Pro Leu Val Asp Gly Ala Leu Ser Lys Asp Ala Tyr Ile Arg Ser
195 200 205

Phe Gln Gln Ser Trp Asn Glu Tyr Ala Lys Arg Gln Gly Lys Ser Leu
210 215 220

Ala Asp Phe Ala Ser Leu Cys Phe His Val Pro Phe Thr Lys Met Gly
225 230 235 240

Lys Lys Ala Leu Glu Ser Ile Ile Asp Asn Ala Asp Glu Thr Thr Gln
245 250 255

Glu Arg Leu Arg Ser Gly Tyr Glu Asp Ala Val Asp Tyr Asn Arg Tyr
260 265 270

Val Gly Asn Ile Tyr Thr Gly Ser Leu Tyr Leu Ser Leu Ile Ser Leu
275 280 285

Leu Glu Asn Arg Asp Leu Gln Ala Gly Glu Thr Ile Gly Leu Phe Ser
290 295 300

Tyr Gly Ser Gly Ser Val Val Glu Phe Tyr Ser Ala Thr Leu Val Val
305 310 315 320

Gly Tyr Lys Asp His Leu Asp Gln Ala Ala His Lys Ala Leu Leu Asn
325 330 335

Asn Arg Thr Glu Val Ser Val Asp Ala Tyr Glu Thr Phe Phe Lys Arg
340 345 350

Phe Asp Asp Val Glu Phe Asp Glu Glu Gln Asp Ala Val His Glu Asp
355 360 365

Arg His Ile Phe Tyr Leu Ser Asn Ile Glu Asn Asn Val Arg Glu Tyr
370 375 380

His Arg Pro Glu
385

<210> 81
<211> 389
<212> PRT
<213> Staphylococcus carnosus

<400> 81

Met Thr Ile Gly Ile Asp Gln Leu Asn Phe Tyr Ile Pro Asn Phe Tyr
1 5 10 15

Val Asp Met Ala Glu Leu Ala Glu Ala Arg Gly Val Asp Pro Asn Lys
20 25 30

Phe Leu Ile Gly Ile Gly Gln Ser Gln Met Ala Val Ser Pro Val Ser
35 40 45

Gln Asp Ile Val Ser Met Gly Ala Asn Ala Ala Gln Pro Ile Leu Ser
50 55 60

Glu Gln Asp Lys Lys Asp Ile Thr Met Val Ile Val Ala Thr Glu Ser
65 70 75 80

Ala Ile Asp Ser Ala Lys Ala Ser Ala Val Gln Ile His His Leu Leu
85 90 95

Gly Ile Gln Pro Phe Ala Arg Cys Phe Glu Met Lys Glu Ala Cys Tyr
100 105 110

Ala Ala Thr Pro Ala Ile Gln Leu Ala Lys Asp Tyr Leu Val Pro Arg
115 120 125

Pro Lys Glu Lys Val Leu Val Ile Ala Ser Asp Thr Ala Arg Tyr Gly
130 135 140

Leu Asn Ser Gly Gly Glu Pro Thr Gln Gly Ala Gly Ala Val Ala Met
145 150 155 160

Val Ile Ser His Asn Pro Ser Ile Leu Glu Leu His Asp Asp Ser Val
165 170 175

Ala Tyr Thr Glu Asp Val Tyr Asp Phe Trp Arg Pro Ser Gly Glu Ile
180 185 190

Tyr Pro Leu Val Ala Gly Lys Leu Ser Lys Asp Ala Tyr Ile Lys Ser
195 200 205

Phe Gln Glu Ser Trp Asn Glu Tyr Ala Lys Arg His His Lys Ser Leu
210 215 220

Ser Asp Phe Ala Ala Leu Cys Phe His Val Pro Phe Thr Lys Met Gly
225 230 235 240

Gln Lys Ala Leu Asp Ser Ile Leu Thr Asp Ser Ala Ser Glu Asp Thr
245 250 255

Gln Ala Arg Leu Asn Glu Gly Tyr Lys Ser Ala Thr Asp Tyr Asn Arg
260 265 270

Tyr Val Gly Asn Val Tyr Thr Gly Ser Leu Tyr Leu Ser Leu Ile Ser
275 280 285

Leu Leu Glu Asn His Lys Leu Asn Gly Gly Asp Asn Ile Gly Leu Phe
290 295 300

Ser Tyr Gly Ser Gly Ser Val Gly Glu Phe Phe Ser Ala Thr Leu Val
305 310 315 320

Asp Asn Tyr Gln Asp His Leu Asp Val Lys Ala His Lys Ala Met Leu
325 330 335

Asp Asn Arg Lys Ala Leu Ser Val Glu Glu Tyr Glu Lys Phe Phe Asn
340 345 350

Arg Phe Asp Asn Leu Glu Phe Asp Thr Glu Thr Glu Leu Glu Val Glu
355 360 365

Pro Lys Gly Asn Phe Tyr Leu Lys Glu Ile Ser Asp Asn Ile Arg Tyr
370 375 380

Tyr Asp Thr Val Lys
385

<210> 82
<211> 389
<212> PRT
<213> Streptomyces sp. CL190

<400> 82

Met Ser Ile Ser Ile Gly Ile His Asp Leu Ser Phe Ala Thr Thr Glu
1 5 10 15

Phe Val Leu Pro His Thr Ala Leu Ala Glu Tyr Asn Gly Thr Glu Ile
20 25 30

Gly Lys Tyr His Val Gly Ile Gly Gln Gln Ser Met Ser Val Pro Ala
35 40 45

Ala Asp Glu Asp Ile Val Thr Met Ala Ala Thr Ala Ala Arg Pro Ile
50 55 60

Ile Glu Arg Asn Gly Lys Ser Arg Ile Arg Thr Val Val Phe Ala Thr
65 70 75 80

Glu Ser Ser Ile Asp Gln Ala Lys Ala Gly Gly Val Tyr Val His Ser
85 90 95

Leu Leu Gly Leu Glu Ser Ala Cys Arg Val Val Glu Leu Lys Gln Ala
100 105 110

Cys Tyr Gly Ala Thr Ala Ala Leu Gln Phe Ala Ile Gly Leu Val Arg
115 120 125

Arg Asp Pro Ala Gln Gln Val Leu Val Ile Ala Ser Asp Val Ser Lys
130 135 140

Tyr Glu Leu Asp Ser Pro Gly Glu Ala Thr Gln Gly Ala Ala Ala Val
145 150 155 160

Ala Met Leu Val Gly Ala Asp Pro Ala Leu Leu Arg Ile Glu Glu Pro
165 170 175

Ser Gly Leu Phe Thr Ala Asp Val Met Asp Phe Trp Arg Pro Asn Tyr
180 185 190

Leu Thr Thr Ala Leu Val Asp Gly Gln Glu Ser Ile Asn Ala Tyr Leu
195 200 205

Gln Ala Val Glu Gly Ala Trp Lys Asp Tyr Ala Glu Gln Asp Gly Arg
210 215 220

Ser Leu Glu Glu Phe Ala Ala Phe Val Tyr His Gln Pro Phe Thr Lys
225 230 235 240

Met Ala Tyr Lys Ala His Arg His Leu Leu Asn Phe Asn Gly Tyr Asp
245 250 255

Thr Asp Lys Asp Ala Ile Glu Gly Ala Leu Gly Gln Thr Thr Ala Tyr
260 265 270

Asn Asn Val Ile Gly Asn Ser Tyr Thr Ala Ser Val Tyr Leu Gly Leu
275 280 285

Ala Ala Leu Leu Asp Gln Ala Asp Asp Leu Thr Gly Arg Ser Ile Gly
290 295 300

Phe Leu Ser Tyr Gly Ser Gly Ser Val Ala Glu Phe Phe Ser Gly Thr
305 310 315 320

Val Val Ala Gly Tyr Arg Glu Arg Leu Arg Thr Glu Ala Asn Gln Glu
325 330 335

Ala Ile Ala Arg Arg Lys Ser Val Asp Tyr Ala Thr Tyr Arg Glu Leu
340 345 350

His Glu Tyr Thr Leu Pro Ser Asp Gly Gly Asp His Ala Thr Pro Val
355 360 365

Gln Thr Thr Gly Pro Phe Arg Leu Ala Gly Ile Asn Asp His Lys Arg
370 375 380

Ile Tyr Glu Ala Arg
385

<210> 83
<211> 389
<212> PRT
<213> Streptomyces griseolosporeus

<400> 83

Met Pro Leu Ala Ile Gly Ile His Asp Leu Ser Phe Ala Thr Gly Glu
1 5 10 15

Phe Val Leu Pro His Thr Ala Leu Ala Ala His Asn Gly Thr Glu Ile
20 25 30

Gly Lys Tyr His Ala Gly Ile Gly Gln Glu Ser Met Ser Val Pro Ala
35 40 45

Ala Asp Glu Asp Ile Val Thr Leu Ala Ala Thr Ala Ala Ala Pro Ile
50 55 60

Val Ala Arg His Gly Ser Asp Arg Ile Arg Thr Val Val Leu Ala Thr
65 70 75 80

Glu Ser Ser Ile Asp Gln Ala Lys Ser Ala Gly Val Tyr Val His Ser
85 90 95

Leu Leu Gly Leu Pro Ser Ala Thr Arg Val Val Glu Leu Lys Gln Ala
100 105 110

Cys Tyr Gly Ala Thr Ala Gly Leu Gln Phe Ala Ile Gly Leu Val Gln
115 120 125

Arg Asp Pro Ala Gln Gln Val Leu Val Ile Ala Ser Asp Val Ser Lys
130 135 140

Tyr Asp Leu Asp Ser Pro Gly Glu Ala Thr Gln Gly Ala Ala Ala Val
145 150 155 160

Ala Met Leu Val Gly Ala Asp Pro Gly Leu Val Arg Ile Glu Asp Pro
165 170 175

Ser Gly Leu Phe Thr Val Asp Val Met Asp Phe Trp Arg Pro Asn Tyr
180 185 190

Arg Thr Thr Ala Leu Val Asp Gly Gln Glu Ser Ile Gly Ala Tyr Leu
195 200 205

Gln Ala Val Glu Gly Ala Trp Lys Asp Tyr Ser Glu Arg Gly Gly His
210 215 220

Ser Leu Glu Gln Phe Ala Ala Phe Cys Tyr His Gln Pro Phe Thr Lys
225 230 235 240

Met Ala His Lys Ala His Arg His Leu Leu Asn Tyr Cys Ser His Asp
245 250 255

Ile His His Asp Asp Val Thr Arg Ala Val Gly Arg Thr Thr Ala Tyr
260 265 270

Asn Arg Val Ile Gly Asn Ser Tyr Thr Ala Ser Val Tyr Leu Gly Leu
275 280 285

Ala Ala Leu Leu Asp Gln Ala Asp Asp Leu Thr Gly Glu Arg Ile Gly
290 295 300

Phe Leu Ser Tyr Gly Ser Gly Ser Val Ala Glu Phe Phe Gly Gly Ile
305 310 315 320

Val Val Ala Gly Tyr Arg Asp Arg Leu Arg Thr Ala Ala Asn Ile Glu
325 330 335

Ala Val Ser Arg Arg Arg Pro Ile Asp Tyr Ala Gly Tyr Arg Glu Leu
340 345 350

His Glu Trp Ala Phe Pro Ala Arg Arg Gly Ala His Ser Thr Pro Gln
355 360 365

Gln Thr Thr Gly Pro Phe Arg Leu Ser Gly Ile Ser Gly His Lys Arg
370 375 380

Leu Tyr Arg Ala Cys
385

<210> 84 
<211> 407 
<212> PRT 

<213> Borrelia burgdorferi 



<400> 84 

Met Arg Ile Gly Ile Ser Asp Ile Arg Ile Phe Leu Pro Leu Asn Tyr
1 5 10 15

Leu Asp Phe Ser Val Leu Leu Glu Asn Pro Leu Tyr Phe Ser Asn Glu
20 25 30

Val Phe Phe Lys Lys Ile Asn Arg Ala Ile Asp Ala Thr Leu Gln Lys
35 40 45

Gly Phe Arg Phe Thr Ser Pro Asn Glu Asp Ser Val Thr Met Ala Ser
50 55 60

Ser Ala Val Lys Leu Ile Phe Asp Asn Asn Asn Leu Asp Leu Ser Lys
65 70 75 80

Ile Arg Ile Leu Leu Gly Gly Thr Glu Thr Gly Val Asp His Ser Lys
85 90 95

Ala Ile Ser Ser Tyr Val Phe Gly Ala Leu Lys Gln Ser Gly Ile Cys
100 105 110

Leu Gly Asn Asn Phe Leu Thr Phe Gln Val Gln His Ala Cys Ala Gly
115 120 125

Ala Ala Met Ser Leu His Thr Val Ala Ser Val Leu Ser His Ser Asn
130 135 140

Asn Ser Glu Tyr Gly Ile Val Phe Ser Ser Asp Ile Ala His Tyr Ser
145 150 155 160

Asn Leu Thr Thr Ala Glu Ile Thr Gln Gly Ala Gly Ala Thr Ala Ile
165 170 175

Leu Ile Glu Lys Asn Pro Lys Ile Leu Ser Ile Asn Leu Ser Glu Phe
180 185 190

Gly Val Tyr Thr Asp Asp Val Asp Asp Phe Phe Arg Pro Phe Gly Ser
195 200 205

Val Glu Ala Lys Val Arg Gly Gln Tyr Ser Val Glu Cys Tyr Asn Asn
210 215 220

Ala Asn Glu Asn Ala Leu Arg Asp Phe Ala Phe Lys Lys Gln Leu Ser
225 230 235 240

Met Lys Asp Leu Phe Ser Asn Tyr Arg Phe Val Leu His Val Pro Phe
245 250 255

Ala Lys Met Pro Ile Asp Ser Met His Tyr Ile Leu Lys Lys Tyr Tyr
260 265 270

Ser Asp Asp Glu Ser Val Arg Asn Ala Tyr Leu Glu Ser Ile Asp Phe
275 280 285

Tyr Asp Gly Val Glu Ala Ala Met Glu Val Gly Asn Leu Tyr Thr Gly
290 295 300

Ser Ile Phe Leu Ser Leu Ala Phe Tyr Leu Lys Arg Val Phe Ser Lys
305 310 315 320

Lys Asp Ile Thr Gly Glu Lys Ile Leu Phe Cys Ser Tyr Gly Ser Gly
325 330 335

Asn Ile Met Ile Ile Tyr Glu Leu Thr Ile Glu Lys Ser Ala Phe Asp
340 345 350

Val Ile Lys Leu Trp Asp Leu Glu Gly Leu Ile Lys Asn Arg Asn Asn
355 360 365

Ala Asn Phe Glu Glu Tyr Lys Asp Phe Phe Gln Asn Lys Ile Ile Pro
370 375 380

Gly Glu Ser Arg Gly Phe Tyr Leu Lys Glu Leu Arg Asn Asp Gly Tyr
385 390 395 400

Arg Val Tyr Gly Tyr Arg Ala
405



<210> 85 



<211> 317
<212> PRT

<213> Streptococcus pneumoniae



<400> 85 

Met Asp Arg Glu Pro Val Thr Val Arg Ser Tyr Ala Asn Ile Ala Ile
1 5 10 15

Ile Lys Tyr Trp Gly Lys Lys Lys Glu Lys Glu Met Val Pro Ala Thr
20 25 30

Ser Ser Ile Ser Leu Thr Leu Glu Asn Met Tyr Thr Glu Thr Thr Leu
35 40 45

Ser Pro Leu Pro Ala Asn Val Thr Ala Asp Glu Phe Tyr Ile Asn Gly
50 55 60

Gln Leu Gln Asn Glu Val Glu His Ala Lys Met Ser Lys Ile Ile Asp
65 70 75 80

Arg Tyr Arg Pro Ala Gly Glu Gly Phe Val Arg Ile Asp Thr Gln Asn
85 90 95

Asn Met Pro Thr Ala Ala Gly Leu Ser Ser Ser Ser Ser Gly Leu Ser
100 105 110

Ala Leu Val Lys Ala Cys Asn Ala Tyr Phe Lys Leu Gly Leu Asp Arg
115 120 125

Ser Gln Leu Ala Gln Glu Ala Lys Phe Ala Ser Gly Ser Ser Ser Arg
130 135 140

Ser Phe Tyr Gly Pro Leu Gly Ala Trp Asp Lys Asp Ser Gly Glu Ile
145 150 155 160

Tyr Pro Val Glu Thr Asp Leu Lys Leu Ala Met Ile Met Leu Val Leu
165 170 175

Glu Asp Lys Lys Lys Pro Ile Ser Ser Arg Asp Gly Met Lys Leu Cys
180 185 190

Val Glu Thr Ser Thr Thr Phe Asp Asp Trp Val Arg Gln Ser Glu Lys
195 200 205

Asp Tyr Gln Asp Met Leu Ile Tyr Leu Lys Glu Asn Asp Phe Ala Lys
210 215 220

Ile Gly Glu Leu Thr Glu Lys Asn Ala Leu Ala Met His Ala Thr Thr
225 230 235 240

Lys Thr Ala Ser Pro Ala Phe Ser Tyr Leu Thr Asp Ala Ser Tyr Glu
245 250 255

Ala Met Ala Phe Val Arg Gln Leu Arg Glu Lys Gly Glu Ala Cys Tyr
260 265 270

Phe Thr Met Asp Ala Gly Pro Asn Val Lys Val Phe Cys Gln Glu Lys
275 280 285

Asp Leu Glu His Leu Ser Glu Ile Phe Gly Gln Arg Tyr Arg Leu Ile
290 295 300

Val Ser Lys Thr Lys Asp Leu Ser Gln Asp Asp Cys Cys
305 310 315

<210> 86 

<211> 314 

<212> PRT 

<213> Streptococcus pyogenes



<400> 86 

Met Asp Pro Asn Val Ile Thr Val Thr Ser Tyr Ala Asn Ile Ala Ile
1 5 10 15

Ile Lys Tyr Trp Gly Lys Glu Asn Gln Ala Lys Met Ile Pro Ser Thr
20 25 30

Ser Ser Ile Ser Leu Thr Leu Glu Asn Met Phe Thr Thr Thr Ser Val
35 40 45

Ser Phe Leu Pro Asp Thr Ala Thr Ser Asp Gln Phe Tyr Ile Asn Gly
50 55 60

Val Leu Gln Asn Asp Glu Glu His Thr Lys Ile Ser Thr Ile Ile Asp
65 70 75 80

Gln Phe Arg Gln Pro Gly Gln Ala Phe Val Lys Met Glu Thr Gln Asn
85 90 95

Asn Met Pro Thr Ala Ala Gly Leu Ser Ser Ser Ser Ser Gly Leu Ser
100 105 110

Ala Leu Val Lys Ala Cys Asp Gln Leu Phe Asp Thr Gln Leu Asp Gln
115 120 125

Lys Ala Leu Ala Gln Lys Ala Lys Phe Ala Ser Gly Ser Ser Ser Arg
130 135 140

Ser Phe Phe Gly Pro Val Ala Ala Trp Asp Lys Asp Ser Gly Ala Ile
145 150 155 160

Tyr Lys Val Glu Thr Asp Leu Lys Met Ala Met Ile Met Leu Val Leu
165 170 175

Asn Ala Ala Lys Lys Pro Ile Ser Ser Arg Glu Gly Met Lys Leu Cys
180 185 190

Arg Asp Thr Ser Thr Thr Phe Asp Glu Trp Val Glu Gln Ser Ala Ile
195 200 205

Asp Tyr Gln His Met Leu Thr Tyr Leu Lys Thr Asn Asn Phe Glu Lys
210 215 220

Val Gly Gln Leu Thr Glu Ala Asn Ala Leu Ala Met His Ala Thr Thr
225 230 235 240

Lys Thr Ala Asn Pro Pro Phe Ser Tyr Leu Thr Lys Glu Ser Tyr Gln
245 250 255

Ala Met Glu Ala Val Lys Glu Leu Arg Gln Glu Gly Phe Ala Cys Tyr
260 265 270

Phe Thr Met Asp Ala Gly Pro Asn Val Lys Val Leu Cys Leu Glu Lys
275 280 285

Asp Leu Ala Gln Leu Ala Glu Arg Leu Gly Lys Asn Tyr Arg Ile Ile
290 295 300

Val Ser Lys Thr Lys Asp Leu Pro Asp Val
305 310

<210> 87 

<211> 331 

<212> PRT 

<213> Enterococcus faecalis 



<400> 87 

Met Leu Ser Gly Lys Ala Arg Ala His Thr Asn Ile Ala Leu Ile Lys
1 5 10 15

Tyr Trp Gly Lys Ala Asn Glu Glu Tyr Ile Leu Pro Met Asn Ser Ser
20 25 30

Leu Ser Leu Thr Leu Asp Ala Phe Tyr Thr Glu Thr Thr Val Thr Phe
35 40 45

Asp Ala His Tyr Ser Glu Asp Val Phe Ile Leu Asn Gly Ile Leu Gln
50 55 60

Asn Glu Lys Gln Thr Lys Lys Val Lys Glu Phe Leu Asn Leu Val Arg
65 70 75 80

Gln Gln Ala Asp Cys Thr Trp Phe Ala Lys Val Glu Ser Gln Asn Phe
85 90 95

Val Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ser Gly Leu Ala Ala
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15 10 15 

Tyr Trp Gly Lys Lys Asn Glu Glu Leu Ile Leu Pro Met Asn Asn Ser
20 25 30

Leu Ser Leu Thr Leu Asp Ala Phe Tyr Thr Glu Thr Glu Val Ile Phe
35 40 45

Ser Asp Ser Tyr Met Val Asp Glu Phe Tyr Leu Asp Gly Thr Leu Gln
50 55 60

Asp Glu Lys Ala Thr Lys Lys Val Ser Gln Phe Leu Asp Leu Phe Arg
65 70 75 80

Lys Glu Ala Gly Leu Ser Leu Lys Ala Ser Val Ile Ser Gln Asn Phe
85 90 95

Val Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ser Gly Leu Ala Ala
100 105 110

Leu Ala Gly Ala Cys Asn Thr Ala Leu Lys Leu Gly Leu Asp Asp Leu
115 120 125

Ser Leu Ser Arg Phe Ala Arg Arg Gly Ser Gly Ser Ala Cys Arg Ser
130 135 140

Ile Phe Gly Gly Phe Val Glu Trp Glu Lys Gly His Asp Asp Leu Ser
145 150 155 160

Ser Tyr Ala Lys Pro Val Pro Ser Asp Ser Phe Glu Asp Asp Leu Ala
165 170 175

Met Val Phe Val Leu Ile Asn Asp Gln Lys Lys Glu Val Ser Ser Arg
180 185 190

Asn Gly Met Arg Arg Thr Val Glu Thr Ser Asn Phe Tyr Gln Gly Trp
195 200 205

Leu Asp Ser Val Glu Gly Asp Leu Tyr Gln Leu Lys Gln Ala Ile Lys
210 215 220

Thr Lys Asp Phe Gln Leu Leu Gly Glu Thr Met Glu Arg Asn Gly Leu
225 230 235 240

Lys Met His Gly Thr Thr Leu Ala Ala Gln Pro Pro Phe Thr Tyr Trp
245 250 255

Ser Pro Asn Ser Leu Lys Ala Met Asp Ala Val Arg Gln Leu Arg Lys
260 265 270

Gln Gly Ile Pro Cys Tyr Phe Thr Met Asp Ala Gly Pro Asn Val Lys
275 280 285

Val Leu Val Glu Asn Ser His Leu Ser Glu Val Gln Glu Thr Phe Thr
290 295 300

Lys Leu Phe Ser Lys Glu Gln Val Ile Thr Ala His Ala Gly Pro Gly
305 310 315 320

Ile Ala Ile Ile Glu
325

<210> 89
<211> 327
<212> PRT

<213> Staphylococcus haemolyticus



<400> 89 

Met Lys Lys Ser Gly Lys Ala Arg Ala His Thr Asn Ile Ala Leu Ile
1 5 10 15

Lys Tyr Trp Gly Lys Ala Asp Glu Ala Leu Ile Ile Pro Met Asn Asn
20 25 30

Ser Leu Ser Val Thr Leu Asp Arg Phe Tyr Thr Glu Thr Arg Val Thr
35 40 45

Phe Asp Glu Thr Leu Thr Glu Asp Gln Leu Ile Leu Asn Gly Glu Ala
50 55 60

Val Asn Ala Lys Glu Ser Ala Lys Ile Gln Arg Tyr Met Glu Met Ile
65 70 75 80

Arg Lys Glu Ala Gly Ile Ser His Glu Ala Leu Ile Glu Ser Glu Asn
85 90 95

Phe Val Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ser Ala Tyr Ala
100 105 110

Ala Leu Ala Gly Ala Cys Asn Glu Ala Leu Gln Leu Gly Leu Ser Asp
115 120 125

Lys Asp Leu Ser Arg Leu Ala Arg Arg Gly Ser Gly Ser Ala Ser Arg
130 135 140

Ser Ile Tyr Gly Gly Phe Ala Glu Trp Glu Lys Gly Asn Asp Asp Glu
145 150 155 160

Thr Ser Phe Ala His Arg Val Glu Ala Asp Gly Trp Glu Asn Glu Leu
165 170 175

Ala Met Val Phe Val Val Ile Asn Asn Lys Ser Lys Lys Val Ser Ser
180 185 190

Arg Ser Gly Met Ser Leu Thr Arg Asp Thr Ser Arg Phe Tyr Gln Tyr
195 200 205

Trp Leu Asp Asn Val Glu Pro Asp Leu Lys Glu Thr Lys Glu Ala Ile
210 215 220

Ala Gln Lys Asp Phe Lys Arg Met Gly Glu Val Ile Glu Ala Asn Gly
225 230 235 240

Leu Arg Met His Ala Thr Asn Leu Gly Ala Gln Pro Pro Phe Thr Tyr
245 250 255

Leu Val Pro Glu Ser Tyr Asp Ala Met Arg Ile Val His Glu Cys Arg
260 265 270

Glu Ala Gly Leu Pro Cys Tyr Phe Thr Met Asp Ala Gly Pro Asn Val
275 280 285

Lys Val Leu Ile Glu Lys Lys Asn Gln Gln Ala Ile Val Asp Lys Phe
290 295 300

Leu Gln Glu Phe Asp Gln Ser Gln Ile Ile Thr Ser Asp Ile Thr Gln
305 310 315 320

Ser Gly Val Glu Ile Ile Lys
325

<210> 90 
<211> 327 
<212> PRT 

<213> Staphylococcus epidermidis



<400> 90 

Met Val Lys Ser Gly Lys Ala Arg Ala His Thr Asn Ile Ala Leu Ile
1 5 10 15

Lys Tyr Trp Gly Lys Ala Asp Glu Thr Tyr Ile Ile Pro Met Asn Asn
20 25 30

Ser Leu Ser Val Thr Leu Asp Arg Phe Tyr Thr Glu Thr Lys Val Thr
35 40 45

Phe Asp Pro Asp Phe Thr Glu Asp Cys Leu Ile Leu Asn Gly Asn Glu
50 55 60

Val Asn Ala Lys Glu Lys Glu Lys Ile Gln Asn Tyr Met Asn Ile Val
65 70 75 80

Arg Asp Leu Ala Gly Asn Arg Leu His Ala Arg Ile Glu Ser Glu Asn
85 90 95

Tyr Val Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ser Ala Tyr Ala
100 105 110

Ala Leu Ala Ala Ala Cys Asn Glu Ala Leu Ser Leu Asn Leu Ser Asp
115 120 125

Thr Asp Leu Ser Arg Leu Ala Arg Arg Gly Ser Gly Ser Ala Ser Arg
130 135 140

Ser Ile Phe Gly Gly Phe Ala Glu Trp Glu Lys Gly His Asp Asp Leu
145 150 155 160

Thr Ser Tyr Ala His Gly Ile Asn Ser Asn Gly Trp Glu Lys Asp Leu
165 170 175

Ser Met Ile Phe Val Val Ile Asn Asn Gln Ser Lys Lys Val Ser Ser
180 185 190

Arg Ser Gly Met Ser Leu Thr Arg Asp Thr Ser Arg Phe Tyr Gln Tyr
195 200 205

Trp Leu Asp His Val Asp Glu Asp Leu Asn Glu Ala Lys Glu Ala Val
210 215 220

Lys Asn Gln Asp Phe Gln Arg Leu Gly Glu Val Ile Glu Ala Asn Gly
225 230 235 240

Leu Arg Met His Ala Thr Asn Leu Gly Ala Gln Pro Pro Phe Thr Tyr
245 250 255

Leu Val Gln Glu Ser Tyr Asp Ala Met Ala Ile Val Glu Gln Cys Arg
260 265 270

Lys Ala Asn Leu Pro Cys Tyr Phe Thr Met Asp Ala Gly Pro Asn Val
275 280 285

Lys Val Leu Val Glu Lys Lys Asn Lys Gln Ala Val Met Glu Gln Phe
290 295 300

Leu Lys Val Phe Asp Glu Ser Lys Ile Ile Ala Ser Asp Ile Ile Ser
305 310 315 320

Ser Gly Val Glu Ile Ile Lys
325

<210> 91
<211> 327
<212> PRT

<213> Staphylococcus aureus

<400> 91

Met Ile Lys Ser Gly Lys Ala Arg Ala His Thr Asn Ile Ala Leu Ile
1 5 10 15

Lys Tyr Trp Gly Lys Lys Asp Glu Ala Leu Ile Ile Pro Met Asn Asn
20 25 30

Ser Ile Ser Val Thr Leu Glu Lys Phe Tyr Thr Glu Thr Lys Val Thr
35 40 45

Phe Asn Asp Gln Leu Thr Gln Asp Gln Phe Trp Leu Asn Gly Glu Lys
50 55 60

Val Ser Gly Lys Glu Leu Glu Lys Ile Ser Lys Tyr Met Asp Ile Val
65 70 75 80

Arg Asn Arg Ala Gly Ile Asp Trp Tyr Ala Glu Ile Glu Ser Asp Asn
85 90 95

Phe Val Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ser Ala Tyr Ala
100 105 110

Ala Leu Ala Ala Ala Cys Asn Gln Ala Leu Asp Leu Gln Leu Ser Asp
115 120 125

Lys Asp Leu Ser Arg Leu Ala Arg Ile Gly Ser Gly Ser Ala Ser Arg
130 135 140

Ser Ile Tyr Gly Gly Phe Ala Glu Trp Glu Lys Gly Tyr Asn Asp Glu
145 150 155 160

Thr Ser Tyr Ala Val Pro Leu Glu Ser Asn His Phe Glu Asp Asp Leu
165 170 175

Ala Met Ile Phe Val Val Ile Asn Gln His Ser Lys Lys Val Pro Ser
180 185 190

Arg Tyr Gly Met Ser Leu Thr Arg Asn Thr Ser Arg Phe Tyr Gln Tyr
195 200 205

Trp Leu Asp His Ile Asp Glu Asp Leu Ala Glu Ala Lys Ala Ala Ile
210 215 220

Gln Asp Lys Asp Phe Lys Arg Leu Gly Glu Val Ile Glu Glu Asn Gly
225 230 235 240

Leu Arg Met His Ala Thr Asn Leu Gly Ser Thr Pro Pro Phe Thr Tyr
245 250 255

Leu Val Gln Glu Ser Tyr Asp Val Met Ala Leu Val His Glu Cys Arg
260 265 270

Glu Ala Gly Tyr Pro Cys Tyr Phe Thr Met Asp Ala Gly Pro Asn Val
275 280 285

Lys Ile Leu Val Glu Lys Lys Asn Lys Gln Gln Ile Ile Asp Lys Leu
290 295 300

Leu Thr Gln Phe Asp Asn Asn Gln Ile Ile Asp Ser Asp Ile Ile Ala
305 310 315 320

Thr Gly Ile Glu Ile Ile Glu
325

<210> 92 
<211> 350 
<212> PRT 

<213> Streptomyces sp. CL190



<400> 92 

Met Arg Ser Glu His Pro Thr Thr Thr Val Leu Gln Ser Arg Glu Gln
1 5 10 15

Gly Ser Ala Ala Gly Ala Thr Ala Val Ala His Pro Asn Ile Ala Leu
20 25 30

Ile Lys Tyr Trp Gly Lys Arg Asp Glu Arg Leu Ile Leu Pro Cys Thr
35 40 45

Thr Ser Leu Ser Met Thr Leu Asp Val Phe Pro Thr Thr Thr Glu Val
50 55 60

Arg Leu Asp Pro Ala Ala Glu His Asp Thr Ala Ala Leu Asn Gly Glu
65 70 75 80

Val Ala Thr Gly Glu Thr Leu Arg Arg Ile Ser Ala Phe Leu Ser Leu
85 90 95

Val Arg Glu Val Ala Gly Ser Asp Gln Arg Ala Val Val Asp Thr Arg
100 105 110

Asn Thr Val Pro Thr Gly Ala Gly Leu Ala Ser Ser Ala Ser Gly Phe
115 120 125

Ala Ala Leu Ala Val Ala Ala Ala Ala Ala Tyr Gly Leu Glu Leu Asp
130 135 140

Asp Arg Gly Leu Ser Arg Leu Ala Arg Arg Gly Ser Gly Ser Ala Ser
145 150 155 160

Arg Ser Ile Phe Gly Gly Phe Ala Val Trp His Ala Gly Pro Asp Gly
165 170 175

Thr Ala Thr Glu Ala Asp Leu Gly Ser Tyr Ala Glu Pro Val Pro Ala
180 185 190

Ala Asp Leu Asp Pro Ala Leu Val Ile Ala Val Val Asn Ala Gly Pro
195 200 205

Lys Pro Val Ser Ser Arg Glu Ala Met Arg Arg Thr Val Asp Thr Ser
210 215 220

Pro Leu Tyr Arg Pro Trp Ala Asp Ser Ser Lys Asp Asp Leu Asp Glu
225 230 235 240

Met Arg Ser Ala Leu Leu Arg Gly Asp Leu Glu Ala Val Gly Glu Ile
245 250 255

Ala Glu Arg Asn Ala Leu Gly Met His Ala Thr Met Leu Ala Ala Arg
260 265 270

Pro Ala Val Arg Tyr Leu Ser Pro Ala Thr Val Thr Val Leu Asp Ser
275 280 285

Val Leu Gln Leu Arg Lys Asp Gly Val Leu Ala Tyr Ala Thr Met Asp
290 295 300

Ala Gly Pro Asn Val Lys Val Leu Cys Arg Arg Ala Asp Ala Glu Arg
305 310 315 320

Val Ala Asp Val Val Arg Ala Ala Ala Ser Gly Gly Gln Val Leu Val
325 330 335

Ala Gly Pro Gly Asp Gly Ala Arg Leu Leu Ser Glu Gly Ala
340 345 350

<210> 93 
<211> 331 
<212> PRT 

<213> Streptomyces griseolosporeus 



<400> 93 

Ala Thr Ala Val Ala Gln Pro Asn Ile Ala Leu Ile Lys Tyr Trp Gly
1 5 10 15

Lys Lys Asp Glu His Leu Val Leu Pro Arg Thr Asp Ser Leu Ser Met
20 25 30

Thr Leu Asp Ile Phe Pro Thr Thr Thr Arg Val Gln Leu Ala Pro Gly
35 40 45

Ala Gly Gln Asp Thr Val Ala Phe Asn Gly Glu Pro Ala Thr Gly Glu
50 55 60

Ala Glu Arg Arg Ile Thr Ala Phe Leu Arg Leu Val Arg Glu Arg Ser
65 70 75 80

Gly Arg Thr Glu Arg Ala Arg Val Glu Thr Glu Asn Thr Val Pro Thr
85 90 95

Gly Ala Gly Leu Ala Ser Ser Ala Ser Gly Phe Ala Ala Leu Ala Val
100 105 110

Ala Ala Ala Ala Ala Tyr Gly Leu Gly Leu Asp Ala Arg Gly Leu Ser
115 120 125

Arg Leu Ala Arg Arg Gly Ser Gly Ser Ala Ser Arg Ser Ile Phe Asp
130 135 140

Gly Phe Ala Val Trp His Ala Gly His Ala Gly Gly Thr Pro Glu Glu
145 150 155 160

Ala Asp Leu Gly Ser Tyr Ala Glu Pro Val Pro Ala Val Asp Leu Glu
165 170 175

Pro Ala Leu Val Val Ala Val Val Ser Ala Ala Pro Lys Ala Val Ser
180 185 190

Ser Arg Glu Ala Met Arg Arg Thr Val Asp Thr Ser Pro Leu Tyr Glu
195 200 205

Pro Trp Ala Val Ser Ser Arg Ala Asp Leu Ala Asp Ile Gly Ala Ala
210 215 220

Leu Ala Arg Gly Asn Leu Pro Ala Val Gly Glu Ile Ala Glu Arg Asn
225 230 235 240

Ala Leu Gly Met His Ala Thr Met Leu Ala Ala Arg Pro Ala Val Arg
245 250 255

Tyr Leu Ser Pro Ala Ser Leu Ala Val Leu Asp Gly Val Leu Gln Leu
260 265 270

Arg Arg Asp Gly Val Pro Ala Tyr Ala Thr Met Asp Ala Gly Pro Asn
275 280 285

Val Lys Val Leu Cys Pro Arg Ser Asp Ala Glu Arg Val Ala Glu Ala
290 295 300

Leu Arg Ala Ala Ala Pro Val Gly Ala Val His Ile Ala Gly Pro Gly
305 310 315 320

Arg Gly Ala Arg Leu Val Ala Glu Glu Cys Arg
325 330

<210> 94 
<211> 312 
<212> PRT 

<213> Borrelia burgdorferi 



<400> 94 

Met Lys Ile Lys Cys Lys Val His Ala Ser Leu Ala Leu Ile Lys Tyr
1 5 10 15

Trp Gly Lys Lys Asp Val Phe Leu Asn Ile Pro Ala Thr Ser Ser Leu
20 25 30

Ala Val Ser Val Asp Lys Phe Tyr Ser Ile Ser Glu Leu Glu Leu Ser
35 40 45

Asn Arg Asp Glu Ile Ile Leu Asn Ser Lys Pro Val Ile Leu Lys Asn
50 55 60

Arg Glu Lys Val Phe Phe Asp Tyr Ala Arg Lys Ile Leu Asn Glu Pro
65 70 75 80

Asn Val Arg Phe Lys Ile Lys Ser Lys Asn Asn Phe Pro Thr Ala Ala
85 90 95

Gly Leu Ala Ser Ser Ser Ser Gly Phe Ala Ser Ile Ala Ala Cys Ile
100 105 110

Leu Lys Tyr Phe Asn Lys Tyr Ser Cys Asn Ser Ala Ser Asn Leu Ala
115 120 125

Arg Val Gly Ser Ala Ser Ala Ala Arg Ala Ile Tyr Gly Gly Phe Thr
130 135 140

Ile Leu Lys Glu Gly Ser Lys Glu Ser Phe Gln Leu Arg Asp Gln Ser
145 150 155 160

Tyr Phe Asn Asp Leu Arg Ile Ile Phe Ala Ile Ile Asp Ser Asn Glu
165 170 175

Lys Glu Leu Ser Ser Arg Ala Ala Met Asn Ile Cys Lys Arg His Lys
180 185 190

Phe Tyr Tyr Asp Ala Trp Ile Ala Ser Ser Lys Lys Ile Phe Lys Asp
195 200 205

Ala Leu Tyr Phe Phe Leu Lys Lys Asp Phe Ile His Phe Gly Ala Thr
210 215 220

Ile Val Lys Ser Tyr Gln Asn Met Phe Ala Leu Met Phe Ala Ser Ser
225 230 235 240

Ile Phe Tyr Phe Lys Asn Ser Thr Ile Asp Leu Ile Arg Tyr Ala Ala
245 250 255

Asp Leu Arg Asn Glu Gly Ile Phe Val Phe Glu Thr Met Asp Ala Gly
260 265 270

Pro Gln Val Lys Phe Leu Cys Leu Glu Glu Asn Leu Asn Thr Ile Leu
275 280 285

Lys Gly Leu Lys Gln Asn Phe Thr Gly Ile Asp Phe Ile Val Ser Lys
290 295 300

Val Gly Cys Asp Leu Glu Trp Ile
305 310

<210> 95
<211> 292
<212> PRT 

<213> Streptococcus pneumoniae 



<400> 95 

Met Thr Lys Lys Val Gly Val Gly Gln Ala His Ser Lys Ile Ile Leu
1 5 10 15

Ile Gly Glu His Ala Val Val Tyr Gly Tyr Pro Ala Ile Ser Leu Pro
20 25 30

Leu Leu Glu Val Glu Val Thr Cys Lys Val Val Ser Ala Glu Ser Pro
35 40 45

Trp Arg Leu Tyr Glu Glu Asp Thr Leu Ser Met Ala Val Tyr Ala Ser
50 55 60

Leu Glu Tyr Leu Asp Ile Thr Glu Ala Cys Val Arg Cys Glu Ile Asp
65 70 75 80

Ser Ala Ile Pro Glu Lys Arg Gly Met Gly Ser Ser Ala Ala Ile Ser
85 90 95

Ile Ala Ala Ile Arg Ala Val Phe Asp Tyr Tyr Gln Ala Asp Leu Pro
100 105 110

His Asp Val Leu Glu Ile Leu Val Asn Arg Ala Glu Met Ile Ala His
115 120 125

Met Asn Pro Ser Gly Leu Asp Ala Lys Thr Cys Leu Ser Asp Gln Pro
130 135 140

Ile Arg Phe Ile Lys Asn Val Gly Phe Thr Glu Leu Glu Met Asp Leu
145 150 155 160

Ser Ala Tyr Leu Val Ile Ala Asp Thr Gly Val Tyr Gly His Thr Arg
165 170 175

Glu Ala Ile Gln Val Val Gln Asn Lys Gly Lys Asp Ala Leu Pro Phe
180 185 190

Leu His Ala Leu Gly Glu Leu Thr Gln Gln Ala Glu Val Ala Ile Ser
195 200 205

Gln Lys Tyr Ala Glu Gly Leu Gly Leu Ile Phe Ser Gln Ala His Leu
210 215 220

His Leu Lys Glu Ile Gly Val Ser Ser Pro Glu Ala Asp Phe Leu Val
225 230 235 240

Glu Thr Ala Leu Ser Tyr Gly Ala Leu Gly Ala Lys Met Ser Gly Gly
245 250 255

Gly Leu Gly Gly Cys Ile Ile Ala Leu Val Thr Asn Leu Thr His Ala
260 265 270

Gln Glu Leu Ala Glu Arg Leu Glu Glu Lys Gly Ala Val Gln Thr Trp
275 280 285

Ile Glu Ser Leu
290

<210> 96 

<211> 292 

<212> PRT 

<213> Streptococcus pyogenes



<400> 96 

Met Asn Glu Asn Ile Gly Tyr Gly Lys Ala His Ser Lys Ile Ile Leu
1 5 10 15

Ile Gly Glu His Ala Val Val Tyr Gly Tyr Pro Ala Ile Ala Leu Pro
20 25 30

Leu Thr Asp Ile Glu Val Val Cys His Ile Phe Pro Ala Asp Lys Pro
35 40 45

Leu Val Phe Asp Phe Tyr Asp Thr Leu Ser Thr Ala Ile Tyr Ala Ala
50 55 60

Leu Asp Tyr Leu Gln Arg Leu Gln Glu Pro Ile Ala Tyr Glu Ile Val
65 70 75 80

Ser Gln Val Pro Gln Lys Arg Gly Met Gly Ser Ser Ala Ala Val Ser
85 90 95

Ile Ala Ala Ile Arg Ala Val Phe Ser Tyr Cys Gln Glu Pro Leu Ser
100 105 110

Asp Asp Leu Leu Glu Ile Leu Val Asn Lys Ala Glu Ile Ile Ala His
115 120 125

Thr Asn Pro Ser Gly Leu Asp Ala Lys Thr Cys Leu Ser Asp His Ala
130 135 140

Ile Lys Phe Ile Arg Asn Ile Gly Phe Glu Thr Ile Glu Ile Ala Leu
145 150 155 160

Asn Gly Tyr Leu Ile Ile Ala Asp Thr Gly Ile His Gly His Thr Arg
165 170 175

Glu Ala Val Asn Lys Val Ala Gln Phe Glu Glu Thr Asn Leu Pro Tyr
180 185 190

Leu Ala Lys Leu Gly Ala Leu Thr Gln Ala Leu Glu Arg Ala Ile Asn
195 200 205

Gln Lys Asn Lys Val Ala Ile Gly Gln Leu Met Thr Gln Ala His Ser
210 215 220

Ala Leu Lys Ala Ile Gly Val Ser Ile Ser Lys Ala Asp Gln Leu Val
225 230 235 240

Glu Ala Ala Leu Arg Ala Gly Ala Leu Gly Ala Lys Met Thr Gly Gly
245 250 255

Gly Leu Gly Gly Cys Met Ile Ala Leu Ala Asp Thr Lys Asp Met Ala
260 265 270

Glu Lys Ile Ser His Lys Leu Lys Glu Glu Gly Ala Val Asn Thr Trp
275 280 285

Ile Gln Met Leu
290

<210> 97 

<211> 314 

<212> PRT 

<213> Enterococcus faecalis 



<400> 97 

Met Asn Ile Lys Lys Gln Gly Leu Gly Gln Ala Thr Gly Lys Ile Ile
1 5 10 15

Leu Met Gly Glu His Ala Val Val Tyr Gly Glu Pro Ala Ile Ala Phe
20 25 30

Pro Phe Gln Ala Thr Glu Ile Thr Ala Val Phe Thr Pro Ala Lys Thr
35 40 45

Met Gln Ile Asp Cys Ala Tyr Phe Thr Gly Leu Leu Glu Asp Val Pro
50 55 60

Gln Glu Leu Ala Asn Ile Lys Glu Val Val Gln Gln Thr Leu His Phe
65 70 75 80

Leu Lys Glu Asp Thr Phe Lys Gly Thr Leu Thr Leu Thr Ser Thr Ile
85 90 95

Pro Ala Glu Arg Gly Met Gly Ser Ser Ala Ala Thr Ala Val Ala Ile
100 105 110

Val Arg Ser Leu Phe Asp Tyr Phe Asp Tyr Ala Tyr Thr Tyr Gln Glu
115 120 125

Leu Phe Glu Leu Val Ser Leu Ser Glu Lys Ile Ala His Gly Asn Pro
130 135 140

Ser Gly Ile Asp Ala Ala Ala Thr Ser Gly Ala Asp Pro Leu Phe Phe
145 150 155 160

Thr Arg Gly Phe Pro Pro Thr His Phe Ser Met Asn Leu Ser Asn Ala
165 170 175

Tyr Leu Val Val Ala Asp Thr Gly Ile Lys Gly Gln Thr Arg Glu Ala
180 185 190

Val Lys Asp Ile Ala Gln Leu Ala Gln Asn Asn Pro Thr Ala Ile Ala
195 200 205

Glu Thr Met Lys Gln Leu Gly Ser Phe Thr Lys Glu Ala Lys Gln Ala
210 215 220

Ile Leu Gln Asp Asp Lys Gln Lys Leu Gly Gln Leu Met Thr Leu Ala
225 230 235 240

Gln Glu Gln Leu Gln Gln Leu Thr Val Ser Asn Asp Met Leu Asp Arg
245 250 255

Leu Val Ala Leu Ser Leu Glu His Gly Ala Leu Gly Ala Lys Leu Thr
260 265 270

Gly Gly Gly Arg Gly Gly Cys Met Ile Ala Leu Thr Asp Asn Lys Lys
275 280 285

Thr Ala Gln Thr Ile Ala Gln Thr Leu Glu Glu Asn Gly Ala Val Ala
290 295 300

Thr Trp Ile Gln Ser Leu Glu Val Lys Lys
305 310

<210> 98 

<211> 314 

<212> PRT 

<213> Enterococcus faecium 



<400> 98 

Met Ala Asn Tyr Gly Gln Gly Glu Ser Ser Gly Lys Ile Ile Leu Met
1 5 10 15

Gly Glu His Ala Val Val Tyr Gly Glu Pro Ala Ile Ala Phe Pro Phe
20 25 30

Tyr Ala Thr Lys Val Thr Ala Phe Leu Glu Glu Leu Asp Ala Met Asp
35 40 45

Asp Gln Leu Val Ser Ser Tyr Tyr Ser Gly Asn Leu Ala Glu Ala Pro
50 55 60

His Ala Leu Lys Asn Ile Lys Lys Leu Phe Ile His Leu Lys Lys Gln
65 70 75 80

His Asp Ile Gln Lys Asn Leu Gln Leu Thr Ile Glu Ser Thr Ile Pro
85 90 95

Ala Glu Arg Gly Met Gly Ser Ser Ala Ala Val Ala Thr Ala Val Thr
100 105 110

Arg Ala Phe Tyr Asp Tyr Leu Ala Phe Pro Leu Ser Arg Glu Ile Leu
115 120 125

Leu Glu Asn Val Gln Leu Ser Glu Lys Ile Ala His Gly Asn Pro Ser
130 135 140

Gly Ile Asp Ala Ala Ala Thr Ser Ser Leu Gln Pro Ile Tyr Phe Thr
145 150 155 160

Lys Gly His Pro Phe Asp Tyr Phe Ser Leu Asn Ile Asp Ala Phe Leu
165 170 175

Ile Val Ala Asp Thr Gly Ile Lys Gly Gln Thr Arg Glu Ala Val Lys
180 185 190

Asp Val Ala His Leu Phe Glu Thr Gln Pro His Glu Thr Gly Gln Met
195 200 205

Ile Gln Lys Leu Gly Tyr Leu Thr Lys Gln Ala Lys Gln Ala Ile Ile
210 215 220

Glu Asn Ser Pro Glu Thr Leu Ala Gln Thr Met Asp Glu Ser Gln Ser
225 230 235 240

Leu Leu Glu Lys Leu Thr Ile Ser Asn Asp Phe Leu Asn Leu Leu Ile
245 250 255

Gln Thr Ala Lys Asp Thr Gly Ala Leu Gly Ala Lys Leu Thr Gly Gly
260 265 270

Gly Arg Gly Gly Cys Met Ile Ala Leu Ala Gln Thr Lys Thr Lys Ala
275 280 285

Gln Glu Ile Ser Gln Ala Leu Glu Asp Ala Gly Ala Ala Glu Thr Trp
290 295 300

Ile Gln Gly Leu Gly Val His Thr Tyr Val
305 310

<210> 99 
<211> 307
<212> PRT

<213> Staphylococcus haemolyticus 



<400> 99 

Met Val Gln Arg Gly Tyr Gly Glu Ser Asn Gly Lys Ile Ile Leu Ile
1 5 10 15

Gly Glu His Ala Val Thr Phe Gly Glu Pro Ala Ile Ala Ile Pro Phe
20 25 30

Thr Ser Gly Lys Val Lys Val Leu Ile Glu Ser Leu Glu Lys Gly Asn
35 40 45

Tyr Ser Ala Ile Gln Ser Asp Val Tyr Asp Gly Pro Leu Tyr Asp Ala
50 55 60

Pro Glu His Leu Lys Ser Leu Ile Gly His Phe Val Glu Asn Lys Lys
65 70 75 80

Val Glu Glu Pro Leu Leu Ile Lys Ile Gln Ala Asn Leu Pro Pro Ser
85 90 95

Arg Gly Leu Gly Ser Ser Ala Ala Val Ala Val Ala Phe Ile Arg Ala
100 105 110

Ser Tyr Asp Tyr Leu Gly Leu Pro Leu Thr Asp Lys Glu Leu Leu Glu
115 120 125

Asn Ala Asp Trp Ala Glu Arg Ile Ala His Gly Lys Pro Ser Gly Ile
130 135 140

Asp Thr Lys Thr Ile Val Thr Asn Gln Pro Val Trp Tyr Gln Lys Gly
145 150 155 160

Glu Val Glu Ile Leu Lys Thr Leu Asp Leu Asp Gly Tyr Met Val Val
165 170 175

Ile Asp Thr Gly Val Lys Gly Ser Thr Lys Gln Ala Val Glu Asp Val
180 185 190

His Gln Leu Cys Asp Asn Asp Lys Asn Tyr Met Gln Val Val Lys His
195 200 205

Ile Gly Ser Leu Val Tyr Ser Ala Ser Glu Ala Ile Glu His His Ser
210 215 220

Phe Asp Gln Leu Ala Thr Ile Phe Asn Gln Cys Gln Asp Asp Leu Arg
225 230 235 240

Thr Leu Thr Val Ser His Asp Lys Ile Glu Met Phe Leu Arg Leu Gly
245 250 255

Glu Glu Asn Gly Ser Val Ala Gly Lys Leu Thr Gly Gly Gly Arg Gly
260 265 270

Gly Ser Met Leu Ile Leu Ala Lys Glu Leu Gln Thr Ala Lys Asn Ile
275 280 285

Val Ala Ala Val Glu Lys Ala Gly Ala Gln His Thr Trp Ile Glu Lys
290 295 300

Leu Gly Gly
305

<210> 100 
<211> 306 
<212> PRT 

<213> Staphylococcus epidermidis



<400> 100 

Met Thr Arg Gln Gly Tyr Gly Glu Ser Thr Gly Lys Ile Ile Leu Met
1 5 10 15

Gly Glu His Ala Val Thr Phe Gly Gln Pro Ala Ile Ala Ile Pro Phe
20 25 30

Asn Ala Gly Lys Ile Lys Val Leu Ile Glu Ser Leu Asp Glu Gly Asn
35 40 45

Tyr Ser Ser Ile Thr Ser Asp Val Tyr Asp Gly Met Leu Tyr Asp Ala
50 55 60

Pro Glu His Leu Lys Ser Ile Ile Asn Arg Phe Val Glu Lys Ser Gly
65 70 75 80

Val Lys Glu Pro Leu Ser Val Lys Ile Gln Thr Asn Leu Pro Pro Ser
85 90 95

Arg Gly Leu Gly Ser Ser Ala Ala Val Ala Val Ala Phe Val Arg Ala
100 105 110

Ser Tyr Asp Phe Met Asp Gln Pro Leu Asp Asp Lys Thr Leu Ile Lys
115 120 125

Glu Ala Asn Trp Ala Glu Gln Ile Ala His Gly Lys Pro Ser Gly Ile
130 135 140

Asp Thr Gln Thr Ile Val Ser Asn Lys Pro Val Trp Phe Lys Gln Gly
145 150 155 160

Gln Ala Glu Thr Leu Lys Ser Leu Lys Leu Asn Gly Tyr Met Val Val
165 170 175

Ile Asp Thr Gly Val Lys Gly Ser Thr Lys Gln Ala Val Glu Asp Val
180 185 190

His Val Leu Cys Glu Ser Asp Glu Tyr Met Lys Tyr Ile Glu His Ile
195 200 205

Gly Thr Leu Val His Ser Ala Ser Glu Ser Ile Glu Gln His Asp Phe
210 215 220

His His Leu Ala Asp Ile Phe Asn Ala Cys Gln Glu Asp Leu Arg His
225 230 235 240

Leu Thr Val Ser His Asp Lys Ile Glu Lys Leu Leu Gln Ile Gly Lys
245 250 255

Glu His Gly Ala Ile Ala Gly Lys Leu Thr Gly Gly Gly Arg Gly Gly
260 265 270

Ser Met Leu Leu Leu Ala Glu Asn Leu Lys Thr Ala Lys Thr Ile Val
275 280 285

Ala Ala Val Glu Lys Ala Gly Ala Ala His Thr Trp Ile Glu His Leu
290 295 300

Gly Gly
305

<210> 101 
<211> 306 
<212> PRT 

<213> Staphylococcus aureus 



<400> 101 




Met Thr Arg Lys Gly Tyr Gly Glu Ser Thr Gly Lys Ile Ile Leu Ile
1 5 10 15

Gly Glu His Ala Val Thr Phe Gly Glu Pro Ala Ile Ala Val Pro Phe
20 25 30

Asn Ala Gly Lys Ile Lys Val Leu Ile Glu Ala Leu Glu Ser Gly Asn
35 40 45

Tyr Ser Ser Ile Lys Ser Asp Val Tyr Asp Gly Met Leu Tyr Asp Ala
50 55 60

Pro Asp His Leu Lys Ser Leu Val Asn Arg Phe Val Glu Leu Asn Asn
65 70 75 80

Ile Thr Glu Pro Leu Ala Val Thr Ile Gln Thr Asn Leu Pro Pro Ser
85 90 95

Arg Gly Leu Gly Ser Ser Ala Ala Val Ala Val Ala Phe Val Arg Ala
100 105 110

Ser Tyr Asp Phe Leu Gly Lys Ser Leu Thr Lys Glu Glu Leu Ile Glu
115 120 125

Lys Ala Asn Trp Ala Glu Gln Ile Ala His Gly Lys Pro Ser Gly Ile
130 135 140

Asp Thr Gln Thr Ile Val Ser Gly Lys Pro Val Trp Phe Gln Lys Gly
145 150 155 160

Gln Ala Glu Thr Leu Lys Thr Leu Ser Leu Asp Gly Tyr Met Val Val
165 170 175

Ile Asp Thr Gly Val Lys Gly Ser Thr Arg Gln Ala Val Glu Asp Val
180 185 190

His Lys Leu Cys Glu Asp Pro Gln Tyr Met Ser His Val Lys His Ile
195 200 205

Gly Lys Leu Val Leu Arg Ala Ser Asp Val Ile Glu His His Asn Phe
210 215 220

Glu Ala Leu Ala Asp Ile Phe Asn Glu Cys His Ala Asp Leu Lys Ala
225 230 235 240

Leu Thr Val Ser His Asp Lys Ile Glu Gln Leu Met Lys Ile Gly Lys
245 250 255

Glu Asn Gly Ala Ile Ala Gly Lys Leu Thr Gly Ala Gly Arg Gly Gly
260 265 270

Ser Met Leu Leu Leu Ala Lys Asp Leu Pro Thr Ala Lys Asn Ile Val
275 280 285

Lys Ala Val Glu Lys Ala Gly Ala Ala His Thr Trp Ile Glu Asn Leu
290 295 300

Gly Gly
305

<210> 102 
<211> 345 
<212> PRT 

<213> Streptomyces sp. CL190



<400> 102 

Met Gln Lys Arg Gln Arg Glu Leu Ser Ala Leu Thr Leu Pro Thr Ser
1 5 10 15

Ala Glu Gly Val Ser Glu Ser His Arg Ala Arg Ser Val Gly Ile Gly
20 25 30

Arg Ala His Ala Lys Ala Ile Leu Leu Gly Glu His Ala Val Val Tyr
35 40 45

Gly Ala Pro Ala Leu Ala Leu Pro Ile Pro Gln Leu Thr Val Thr Ala
50 55 60

Ser Val Gly Trp Ser Ser Glu Ala Ser Asp Ser Ala Gly Gly Leu Ser
65 70 75 80

Tyr Thr Met Thr Gly Thr Pro Ser Arg Ala Leu Val Thr Gln Ala Ser
85 90 95

Asp Gly Leu His Arg Leu Thr Ala Glu Phe Met Ala Arg Met Gly Val
100 105 110

Thr Asn Ala Pro His Leu Asp Val Ile Leu Asp Gly Ala Ile Pro His
115 120 125

Gly Arg Gly Leu Gly Ser Ser Ala Ala Gly Ser Arg Ala Ile Ala Leu
130 135 140

Ala Leu Ala Asp Leu Phe Gly His Glu Leu Ala Glu His Thr Ala Tyr
145 150 155 160

Glu Leu Val Gln Thr Ala Glu Asn Met Ala His Gly Arg Ala Ser Gly
165 170 175

Val Asp Ala Met Thr Val Gly Ala Ser Arg Pro Leu Leu Phe Gln Gln
180 185 190

Gly Arg Thr Glu Arg Leu Ala Ile Gly Cys Asp Ser Leu Phe Ile Val
195 200 205

Ala Asp Ser Gly Val Pro Gly Ser Thr Lys Glu Ala Val Glu Met Leu
210 215 220

Arg Glu Gly Phe Thr Arg Ser Ala Gly Thr Gln Glu Arg Phe Val Gly
225 230 235 240

Arg Ala Thr Glu Leu Thr Glu Ala Ala Arg Gln Ala Leu Ala Asp Gly
245 250 255

Arg Pro Glu Glu Leu Gly Ser Gln Leu Thr Tyr Tyr His Glu Leu Leu
260 265 270

His Glu Ala Arg Leu Ser Thr Asp Gly Ile Asp Ala Leu Val Glu Ala
275 280 285

Ala Leu Lys Ala Gly Ser Leu Gly Ala Lys Ile Thr Gly Gly Gly Leu
290 295 300

Gly Gly Cys Met Ile Ala Gln Ala Arg Pro Glu Gln Ala Arg Glu Val
305 310 315 320

Thr Arg Gln Leu His Glu Ala Gly Ala Val Gln Thr Trp Val Val Pro
325 330 335

Leu Lys Gly Leu Asp Asn His Ala Gln
340 345

<210> 103 
<211> 334 
<212> PRT 

<213> Streptomyces griseolosporeus 



<400> 103 

Met Thr Leu Pro Thr Ser Val Glu Glu Gly Ser Lys Ala His Arg Ala
1 5 10 15

Arg Ala Val Gly Thr Gly Arg Ala His Ala Lys Ala Ile Leu Leu Gly
20 25 30

Glu His Ala Val Val Tyr Gly Thr Pro Ala Leu Ala Met Pro Ile Pro
35 40 45

Gln Leu Ala Val Thr Ala Ser Ala Gly Trp Ser Gly Arg Ser Ala Glu
50 55 60

Ser Ara Gly Gly Pro Thr Phe Thr Met Thr Gly Ser Ala Ser Arg Ala
65 70 75 80

Val Thr Ala Gln Ala Leu Asp Gly Leu Arg Arg Leu Thr Ala Ser Val
85 90 95

Lys Ala His Thr Gly Val Thr Asp Gly Gln His Leu Asp Val Ser Leu
100 105 110

Asp Gly Ala Ile Pro Pro Gly Arg Gly Leu Gly Ser Ser Ala Ala Asn
115 120 125

Ala Arg Ala Ile Ile Leu Ala Leu Ala Asp Leu Phe Gly Arg Glu Leu
130 135 140

Thr Glu Gly Glu Val Phe Asp Leu Val Gln Glu Ala Glu Asn Leu Thr
145 150 155 160

His Gly Arg Ala Ser Gly Val Asp Ala Val Thr Val Gly Ala Thr Ala
165 170 175

Pro Leu Leu Phe Arg Ala Gly Thr Ala Gln Ala Leu Asp Ile Gly Cys
180 185 190

Asp Ala Leu Phe Val Val Ala Asp Ser Gly Thr Ala Gly Ser Thr Lys
195 200 205

Glu Ala Ile Glu Leu Leu Arg Ala Gly Phe Arg Ala Gly Ala Gly Lys
210 215 220

Glu Glu Arg Phe Met His Arg Ala Ala His Leu Val Asp Asp Ala Arg
225 230 235 240

Ala Ser Leu Ala Glu Gly Glu Pro Glu Ala Phe Gly Ser Cys Leu Thr
245 250 255

Glu Tyr His Gly Leu Leu Arg Gly Ala Gly Leu Ser Thr Asp Arg Ile
260 265 270

Asp Ala Leu Val Asp Ala Ala Leu Gln Ala Asp Ser Leu Gly Ala Lys
275 280 285

Ile Thr Gly Gly Gly Leu Gly Gly Cys Val Leu Ala Met Ser Arg Pro
290 295 300

Glu Arg Ala Glu Glu Val Ala Arg Gln Leu His Ala Ala Gly Ala Val
305 310 315 320

Arg Thr Trp Ala Val Gln Leu Arg Arg Ser Thr His Glu Arg
325 330



<210> 104
<211> 296
<212> PRT
<213> Borrelia

<400> 104



Met Leu Arg Ile Arg Lys Pro Ala Lys Ile Leu Phe Leu Gly Glu His
1 5 10 15

Ser Ala Val Tyr Gly Phe Pro Val Ile Gly Ala Thr Val Pro Ile Tyr
20 25 30

Met Asp Leu Ile Tyr Ser Val Ser Lys Asn Trp Lys Tyr Leu Gly Lys
35 40 45

Pro Ser Thr Arg Leu Asn Ser Leu Ile Ser Phe Ile Val Ser Asn Tyr
50 55 60

Ser Lys Val Asn Pro Ile Glu Phe Asp Ile Ile Ser Glu Ile Pro Ile
65 70 75 80

Gly Val Gly Leu Gly Ser Ser Ala Ser Leu Ser Leu Cys Phe Ala Glu
85 90 95

Tyr Ile Thr Ser His Phe Glu Tyr Lys Asp Cys Asn Lys Ile Leu Leu
100 105 110

Ala Asn Gln Ile Glu Asn Ile Phe His Gly Lys Ser Ser Gly Met Asp
115 120 125

Ile Arg Leu Ile Asp Leu Asn Gly Thr Phe Tyr Leu Glu Lys Lys Glu
130 135 140

Asn Val Leu His Ser Lys Lys Ile Lys Asp Ser Gly Phe Tyr Phe Leu
145 150 155 160

Ile Gly Ala Ile Lys Arg Asp Leu Thr Thr Lys Glu Ile Val Val Asn
165 170 175

Leu Lys Lys Ar;p Leu Leu Ser Asn Ala Tyr Leu Phe Val Phe Ile Glu
180 185 190

Lys Leu Gly Leu Ala Val Ser Asn Ser Tyr Ala Ser Phe Gln Asn Lys
195 200 205

Asp Val Tyr Ser Leu Ala Asn Glu Met Asn Ile Ala Gln Cys Cys Leu
210 215 220

Lys Arg Leu Gly Leu Ser Asn Asp Thr Leu Asp Trp Leu Ile Ser Glu
225 230 235 240

Gly Ile Lys Leu Gly Ala Leu Ser Gly Lys Leu Ser Gly Ala Gly Lys
245 250 255

Gly Gly Ala Phe Ile Phe Leu Phe Glu Ser Leu Ile Lys Ala Asn Ile
260 265 270

Val Gln Lys Glu Leu Asn Asn Met Leu Asp Ser Lys Ile Asp Leu Leu
275 280 285

Leu Lys Leu Lys Val Ile Glu Thr
290 295

<210> 105 

<211> 336 

<212> PRT 

<213> Streptococcus pneumoniae 



<400> 105 

Met Ile Ala Val Lys Thr Cys Gly Lys Leu Tyr Trp Ala Gly Glu Tyr
1 5 10 15

Ala Ile Leu Glu Pro Gly Gln Leu Ala Leu Ile Lys Asp Ile Pro Ile
20 25 30

Tyr Met Arg Ala Glu Ile Ala Phe Ser Asp Ser Tyr Arg Ile Tyr Ser
35 40 45

Asp Met Phe Asp Phe Ala Val Asp Leu Arg Pro Asn Pro Asp Tyr Ser
50 55 60

Leu Ile Gln Glu Thr Ile Ala Leu Met Gly Asp Phe Leu Ala Val Arg
65 70 75 80

Gly Gln Asn Leu Arg Pro Phe Ser Leu Lys Ile Cys Gly Lys Met Glu
85 90 95

Arg Glu Gly Lys Lys Phe Gly Leu Gly Ser Ser Gly Ser Val Val Val
100 105 110

Leu Val Val Lys Ala Leu Leu Ala Leu Tyr Asn Leu Ser Val Asp Gln
115 120 125

Asn Leu Leu Phe Lys Leu Thr Ser Ala Val Leu Leu Lys Arg Gly Asp
130 135 140

Asn Gly Ser Met Gly Asp Leu Ala Cys Ile Val Ala Glu Asp Leu Val
145 150 155 160

Leu Tyr Gln Ser Phe Asp Arg Gln Lys Ala Ala Ala Trp Leu Glu Glu
165 170 175

Glu Asn Leu Ala Thr Val Leu Glu Arg Asp Trp Gly Phe Phe Ile Ser
180 185 190

Gln Val Lys Pro Thr Leu Glu Cys Asp Phe Leu Val Gly Trp Thr Lys
195 200 205

Glu Val Ala Val Ser Ser His Met Val Gln Gln Ile Lys Gln Asn Ile
210 215 220

Asn Gln Asn Phe Leu Ser Ser Ser Lys Glu Thr Val Val Ser Leu Val
225 230 235 240

Glu Ala Leu Glu Gln Gly Lys Ala Glu Lys Val Ile Glu Gln Val Glu
245 250 255

Val Ala Ser Lys Leu Leu Glu Gly Leu Ser Thr Asp Ile Tyr Thr Pro
260 265 270

Leu Leu Arg Gln Leu Lys Glu Ala Ser Gln Asp Leu Gln Ala Val Ala
275 280 285

Lys Ser Ser Gly Ala Gly Gly Gly Asp Cys Gly Ile Ala Leu Ser Phe
290 295 300

Asp Ala Gln Ser Ser Arg Asn Thr Leu Lys Asn Arg Trp Ala Asp Leu
305 310 315 320

Gly Ile Glu Leu Leu Tyr Gln Glu Arg Ile Gly His Asp Asp Lys Ser
325 330 335

<210> 106
<211> 335
<212> PRT
<213> Streptococcus pyrogenes

<400> 106

Met Ser Asn Tyr Cys Val Gln Thr Gly Gly Lys Leu Tyr Leu Thr Gly
1 5 10 15

Glu Tyr Ala Ile Leu Ile Pro Gly Gln Lys Ala Leu Ile His Phe Ile
20 25 30

Pro Leu Met Met Thr Ala Glu Ile Ser Pro Ala Ala His Ile Gln Leu
35 40 45

Ala Ser Asp Met Phe Ser His Lys Ala Gly Met Thr Pro Asp Ala Ser
50 55 60

Tyr Ala Leu Ile Gln Ala Thr Val Lys Thr Phe Ala Asp Tyr Leu Gly
65 70 75 80

Gln Ser Ile Asp Gln Leu Glu Pro Phe Ser Leu Ile Ile Thr Gly Lys
85 90 95

Met Glu Arg Asp Gly Lys Lys Phe Gly Ile Gly Ser Ser Gly Ser Val
100 105 110

Thr Leu Leu Thr Leu Lys Ala Leu Ser Ala Tyr Tyr Gln Ile Thr Leu
115 120 125

Thr Pro Glu Leu Leu Phe Lys Leu Ala Ala Tyr Thr Leu Leu Lys Gln
130 135 140

Gly Asp Asn Gly Ser Met Gly Asp Ile Ala Cys Ile Ala Tyr Gln Thr
145 150 155 160

Leu Val Ala Tyr Thr Ser Phe Asp Arg Glu Gln Val Ser Asn Trp Leu
165 170 175

Gln Thr Met Pro Leu Lys Lys Leu Leu Val Lys Asp Trp Gly Tyr His
180 185 190

Ile Gln Val Ile Gln Pro Ala Leu Pro Cys Asp Phe Leu Val Gly Trp
195 200 205

Thr Lys Ile Pro Ala Ile Ser Arg Gln Met Ile Gln Gln Val Thr Ala
210 215 220

Ser Ile Thr Pro Ala Phe Leu Arg Thr Ser Tyr Gln Leu Thr Gln Ser
225 230 235 240

Ala Met Val Ala Leu Gln Glu Gly His Lys Glu Glu Leu Lys Lys Ser
245 250 255

Leu Ala Gly Ala Ser His Leu Leu Lys Glu Leu His Pro Ala Ile Tyr
260 265 270

His Pro Lys Leu Val Thr Leu Val Ala Ala Cys Gln Lys Gln Asp Ala
275 280 285

Val Ala Lys Ser Ser Gly Ala Gly Gly Gly Asp Cys Gly Ile Ala Leu
290 295 300

Ala Phe Asn Gln Asp Ala Arg Asp Thr Leu Ile Ser Lys Trp Gln Glu
305 310 315 320

Ala Asp Ile Ala Leu Leu Tyr Gln Glu Arg Trp Gly Glu Asn Asp
325 330 335

<210> 107 
<211> 368 
<212> PRT 

<213> Enterococcus faecalis 



<400> 107 

Met Ile Glu Val Thr Thr Pro Gly Lys Leu Phe Ile Ala Gly Glu Tyr
1 5 10 15

Ala Val Val Glu Pro Gly His Pro Ala Ile Ile Val Ala Val Asp Gln
20 25 30

Phe Val Thr Val Thr Val Glu Glu Thr Thr Asp Glu Gly Ser Ile Gln
35 40 45

Ser Ala Gln Tyr Ser Ser Leu Pro Ile Arg Trp Thr Arg Arg Asn Gly
50 55 60

Glu Leu Val Leu Asp Ile Arg Glu Asn Pro Phe His Tyr Val Leu Ala
65 70 75 80

Ala Ile His Leu Thr Glu Lys Tyr Ala Gln Glu Gln Asn Lys Glu Leu
85 90 95

Ser Phe Tyr His Leu Lys Val Thr Ser Glu Leu Asp Ser Ser Asn Gly
100 105 110

Arg Lys Tyr Gly Leu Gly Ser Ser Gly Ala Val Thr Val Gly Thr Val
115 120 125

Lys Ala Leu Asn Ile Phe Tyr Asp Leu Gly Leu Glu Asn Glu Glu Ile
130 135 140

Phe Lys Leu Ser Ala Leu Ala His Leu Ala Val Gln Gly Asn Gly Ser
145 150 155 160

Cys Gly Asp Ile Ala Ala Ser Cys Tyr Gly Gly Trp Ile Ala Phe Ser
165 170 175

Thr Phe Asp His Asp Trp Val Asn Gln Lys Val Thr Thr Glu Thr Leu
180 185 190

Thr Asp Leu Leu Ala Met Asp Trp Pro Glu Leu Met Ile Phe Pro Leu
195 200 205

Lys Val Pro Lys Gln Leu Arg Leu Leu Ile Gly Trp Thr Gly Ser Pro
210 215 220

Ala Ser Thr Ser Asp Leu Val Asp Arg Val His Gln Ser Lys Glu Glu
225 230 235 240

Lys Gln Ala Ala Tyr Glu Gln Phe Leu Met Lys Ser Arg Leu Cys Val
245 250 255

Glu Thr Met Ile Asn Gly Phe Asn Thr Gly Lys Ile Ser Val Ile Gln
260 265 270

Lys Gln Ile Thr Lys Asn Arg Gln Leu Leu Ala Glu Leu Ser Ser Leu
275 280 285

Thr Gly Val Val Ile Glu Thr Glu Ala Leu Lys Asn Leu Cys Asp Leu
290 295 300

Ala Glu Ser Tyr Thr Gly Ala Ala Lys Ser Ser Gly Ala Gly Gly Gly
305 310 315 320

Asp Cys Gly Ile Val Ile Phe Arg Gln Lys Ser Gly Ile Leu Pro Leu
325 330 335

Met Thr Ala Trp Glu Lys Asp Gly Ile Thr Pro Leu Pro Leu His Val
340 345 350

Tyr Thr Tyr Gly Gln Lys Glu Cys Lys Glu Lys His Glu Ser Lys Arg
355 360 365

<210> 108
<211> 361
<212> PRT
<213> Enterococcus faecium

<400> 108

Met Ile Glu Val Ser Ala Pro Gly Lys Leu Tyr Ile Ala Gly Glu Tyr
1 5 10 15

Ala Val Val Glu Thr Gly His Pro Ala Val Ile Ala Ala Val Asp Gln
20 25 30

Phe Val Thr Val Thr Val Glu Ser Ala Arg Lys Val Gly Ser Ile Gln
35 40 45

Ser Ala Gln Tyr Ser Gly Met Pro Val Arg Trp Thr Arg Arg Asn Gly
50 55 60

Glu Leu Val Leu Asp Ile Arg Glu Asn Pro Phe His Tyr Ile Leu Ala
65 70 75 80

Ala Ile Arg Leu Thr Glu Lys Tyr Ala Gln Glu Lys Asn Ile Leu Leu
85 90 95

Ser Phe Tyr Asp Leu Lys Val Thr Ser Glu Leu Asp Ser Ser Asn Gly
100 105 110

Arg Lys Tyr Gly Leu Gly Ser Ser Gly Ala Val Thr Val Ala Thr Val
115 120 125

Lys Ala Leu Asn Val Phe Tyr Ala Leu Asn Leu Ser Gln Leu Glu Ile
130 135 140

Phe Lys Ile Ala Ala Leu Ala Asn Leu Ala Val Gln Asp Asn Gly Ser
145 150 155 160

Cys Gly Asp Ile Ala Ala Ser Cys Tyr Gly Gly Trp Ile Ala Phe Ser
165 170 175

Thr Phe Asp His Pro Trp Leu Gln Glu Gln Glu Thr Gln His Ser Ile
180 185 190

Ser Glu Leu Leu Ala Leu Asp Trp Pro Gly Leu Ser Ile Glu Pro Leu
195 200 205

Ile Ala Pro Glu Asp Leu Arg Leu Leu Ile Gly Trp Thr Gly Ser Pro
210 215 220

Ala Ser Thr Ser Asp Leu Val Asp Gln Val His Arg Ser Arg Glu Asp
225 230 235 240

Lys Met Val Ala Tyr Gln Leu Phe Leu Lys Asn Ser Thr Glu Cys Val
245 250 255

Asn Glu Met Ile Lys Gly Phe Lys Glu Asn Asn Val Thr Leu Ile Gln
260 265 270

Gln Met Ile Arg Lys Asn Arg Gln Leu Leu His Asp Leu Ser Ala Ile
275 280 285

Thr Gly Val Val Ile Glu Thr Pro Ala Leu Asn Lys Leu Cys Asn Leu
290 295 300

Ala Glu Gln Tyr Glu Gly Ala Ala Lys Ser Ser Gly Ala Gly Gly Gly
305 310 315 320

Asp Cys Gly Ile Val Ile Val Asp Gln Lys Ser Gly Ile Leu Pro Leu
325 330 335

Met Ser Ala Trp Glu Lys Ala Glu Ile Thr Pro Leu Pro Leu His Val
340 345 350

Tyr Ser Asp Gln Arg Lys Glu Asn Arg
355 360

<210> 109 

<211> 358 

<212> PRT 

<213> Staphylococcus haemolyticus 



<400> 109 

Met Ile Gln Val Lys Ala Pro Gly Lys Leu Tyr Val Ala Gly Glu Tyr
1 5 10 15

Ala Val Thr Glu Pro Gly Tyr Lys Ser Val Leu Ile Ala Val Asp Arg
20 25 30

Phe Val Thr Ala Ser Ile Glu Ala Ser Asn Ala Val Thr Ser Thr Ile
35 40 45

His Ser Lys Thr Leu His Tyr Glu Pro Val Thr Phe Asn Arg Asn Glu
50 55 60

Asp Lys Ile Asp Ile Ser Asp Ala Asn Ala Ala Ser Gln Leu Lys Tyr
65 70 75 80

Val Val Thr Ala Ile Glu Val Phe Glu Gln Tyr Ala Arg Ser Cys Asn
85 90 95

Val Lys Leu Lys His Phe His Leu Glu Ile Asp Ser Asn Leu Asp Asp
100 105 110

Ala Ser Gly Asn Lys Tyr Gly Leu Gly Ser Ser Ala Ala Val Leu Val
115 120 125

Ser Val Val Lys Ala Leu Asn Glu Phe Tyr Asp Met Gln Leu Ser Asn
130 135 140

Leu Tyr Ile Tyr Lys Leu Ala Val Ile Ser Asn Met Arg Leu Gln Ser
145 150 155 160

Leu Ser Ser Cys Gly Asp Ile Ala Val Ser Val Tyr Ser Gly Trp Leu
165 170 175

Ala Tyr Ser Thr Phe Asp His Asp Trp Val Lys Gln Gln Met Glu Glu
180 185 190

Thr Ser Val Asn Glu Val Leu Glu Lys Asn Trp Pro Gly Leu His Ile
195 200 205

Glu Pro Leu Gln Ala Pro Glu Asn Met Glu Val Leu Ile Gly Trp Thr
210 215 220

Gly Ser Pro Ala Ser Ser Pro His Leu Val Ser Glu Val Lys Arg Leu
225 230 235 240

Lys Ser Asp Pro Ser Phe Tyr Gly Arg Phe Leu Asp Gln Ser His Thr
245 250 255

Cys Val Glu Asn Leu Ile Tyr Ala Phe Lys Thr Asp Asn Ile Lys Gly
260 265 270

Val Gln Lys Met Ile Arg Gln Asn Arg Met Ile Ile Gln Gln Met Asp
275 280 285

Asn Glu Ala Thr Val Asp Ile Glu Thr Glu Asn Leu Lys Met Leu Cys
290 295 300

Asp Ile Gly Glu Arg Tyr Gly Ala Ala Ala Lys Thr Ser Gly Ala Gly
305 310 315 320

Gly Gly Asp Cys Gly Ile Ala Ile Ile Asp Asn Arg Ile Asp Lys Asn
325 330 335

Arg Ile Tyr Asn Glu Trp Ala Ser His Gly Ile Lys Pro Leu Lys Phe
340 345 350

Lys Ile Tyr His Gly Gln
355

<210> 110 

<211> 358 

<212> PRT 

<213> Staphylococcus epidermis 



<400> 110 

Met Ile Gln Val Lys Ala Pro Gly Lys Leu Tyr Ile Ala Gly Glu Tyr
1 5 10 15

Ala Val Thr Glu Pro Gly Tyr Lys Ser Ile Leu Ile Ala Val Asn Arg
20 25 30

Phe Val Thr Ala Thr Ile Glu Ala Ser Asn Lys Val Glu Gly Ser Ile
35 40 45

His Ser Lys Thr Leu His Tyr Glu Pro Val Lys Phe Asp Arg Asn Glu
50 55 60

Asp Arg Ile Glu Ile Ser Asp Val Gln Ala Ala Lys Gln Leu Lys Tyr
65 70 75 80

Val Val Thr Ala Ile Glu Val Phe Glu Gln Tyr Val Arg Ser Cys Asn
85 90 95

Met Asn Leu Lys His Phe His Leu Thr Ile Asp Ser Asn Leu Ala Asp
100 105 110

Asn Ser Gly Gln Lys Tyr Gly Leu Gly Ser Ser Ala Ala Val Leu Val
115 120 125

Ser Val Val Lys Ala Leu Asn Glu Phe Tyr Gly Leu Glu Leu Ser Asn
130 135 140

Leu Tyr Ile Tyr Lys Leu Ala Val Ile Ala Asn Met Lys Leu Gln Ser
145 150 155 160

Leu Ser Ser Cys Gly Asp Ile Ala Val Ser Val Tyr Ser Gly Trp Leu
165 170 175

Ala Tyr Ser Thr Phe Asp His Asp Trp Val Lys Gln Gln Met Glu Glu
180 185 190

Thr Ser Val Asn Asp Val Leu Glu Lys Asn Trp Pro Gly Leu His Ile
195 200 205

Glu Pro Leu Gln Ala Pro Glu Asn Met Glu Val Leu Ile Gly Trp Thr
210 215 220

Gly Ser Pro Ala Ser Ser Pro His Leu Val Ser Glu Val Lys Arg Leu
225 230 235 240

Lys Ser Asp Pro Ser Phe Tyr Gly Asp Phe Leu Asp Gln Ser His Ala
245 250 255

Cys Val Glu Ser Leu Ile Gln Ala Phe Lys Thr Asn Asn Ile Lys Gly
260 265 270

Val Gln Lys Met Ile Arg Ile Asn Arg Arg Ile Ile Gln Ser Met Asp
275 280 285

Asn Glu Ala Ser Val Glu Ile Glu Thr Asp Lys Leu Lys Lys Leu Cys
290 295 300

Asp Val Gly Glu Lys His Gly Gly Ala Ser Lys Thr Ser Gly Ala Gly
305 310 315 320

Gly Gly Asp Cys Gly Ile Thr Ile Ile Asn Lys Val Ile Asp Lys Asn
325 330 335

Ile Ile Tyr Asn Glu Trp Gln Met Asn Asp Ile Lys Pro Leu Lys Phe
340 345 350

Lys Ile Tyr His Gly Gln
355

<210> 111
<211> 358
<212> PRT
<213> Staphylococcus aureus

<400> 111

Met Ile Gln Val Lys Ala Pro Gly Lys Leu Tyr Ile Ala Gly Glu Tyr
1 5 10 15

Ala Val Thr Glu Pro Gly Tyr Lys Ser Val Leu Ile Ala Leu Asp Arg
20 25 30

Phe Val Thr Ala Thr Ile Glu Glu Ala Thr Gln Tyr Lys Gly Thr Ile
35 40 45

His Ser Lys Ala Leu His His Asn Pro Val Thr Phe Ser Arg Asp Glu
50 55 60

Asp Ser Ile Val Ile Ser Asp Pro His Ala Ala Lys Gln Leu Asn Tyr
65 70 75 80

Val Val Thr Ala Ile Glu Ile Phe Glu Gln Tyr Ala Lys Ser Cys Asp
85 90 95

Ile Ala Met Lys His Phe His Leu Thr Ile Asp Ser Asn Leu Asp Asp
100 105 110

Ser Asn Gly His Lys Tyr Gly Leu Gly Ser Ser Ala Ala Val Leu Val
115 120 125

Ser Val Ile Lys Val Leu Asn Glu Phe Tyr Asp Met Lys Leu Ser Asn
130 135 140

Leu Tyr Ile Tyr Lys Leu Ala Val Ile Ala Asn Met Lys Leu Gln Ser
145 150 155 160

Leu Ser Ser Cys Gly Asp Ile Ala Val Ser Val Tyr Ser Gly Trp Leu
165 170 175

Ala Tyr Ser Thr Phe Asp His Glu Trp Val Lys His Gln Ile Glu Asp
180 185 190

Thr Thr Val Glu Glu Val Leu Ile Lys Asn Trp Pro Gly Leu His Ile
195 200 205

Glu Pro Leu Gln Ala Pro Glu Asn Met Glu Val Leu Ile Gly Trp Thr
210 215 220

Gly Ser Pro Ala Ser Ser Pro His Phe Val Ser Glu Val Lys Arg Leu
225 230 235 240

Lys Ser Asp Pro Ser Phe Tyr Gly Asp Phe Leu Glu Asp Ser His Arg
245 250 255

Cys Val Glu Lys Leu Ile His Ala Phe Lys Thr Asn Asn Ile Lys Gly
260 265 270

Val Gln Lys Met Val Arg Gln Asn Arg Thr Ile Ile Gln Arg Met Asp
275 280 285

Lys Glu Ala Thr Val Asp Ile Glu Thr Glu Lys Leu Lys Tyr Leu Cys
290 295 300

Asp Ile Ala Glu Lys Tyr His Gly Ala Ser Lys Thr Ser Gly Ala Gly
305 310 315 320

Gly Gly Asp Cys Gly Ile Thr Ile Ile Asn Lys Asp Val Asp Lys Glu
325 330 335

Lys Ile Tyr Asp Glu Trp Thr Lys His Gly Ile Lys Pro Leu Lys Phe
340 345 350

Asn Ile Tyr His Gly Gln
355

<210> 112
<211> 374
<212> PRT
<213> Streptomyces sp. CL190

<400> 112

Met Thr Thr Gly Gln Arg Thr Ile Val Arg His Ala Pro Gly Lys Leu
1 5 10 15

Phe Val Ala Gly Glu Tyr Ala Val Val Asp Pro Gly Asn Pro Ala Ile
20 25 30

Leu Val Ala Val Asp Arg His Ile Ser Val Thr Val Ser Asp Ala Asp
35 40 45

Ala Asp Thr Gly Ala Ala Asp Val Val Ile Ser Ser Asp Leu Gly Pro
50 55 60

Gln Ala Val Gly Trp Arg Trp His Asp Gly Arg Leu Val Val Arg Asp
65 70 75 80

Pro Asp Asp Gly Gln Gln Ala Arg Ser Ala Leu Ala His Val Val Ser
85 90 95

Ala Ile Glu Thr Val Gly Arg Leu Leu Gly Glu Arg Gly Gln Lys Val
100 105 110

Pro Ala Leu Thr Leu Ser Val Ser Ser Arg Leu His Glu Asp Gly Arg
115 120 125

Lys Phe Gly Leu Gly Ser Ser Gly Ala Val Thr Val Ala Thr Val Ala
130 135 140

Ala Val Ala Ala Phe Cys Gly Leu Glu Leu Ser Thr Asp Glu Arg Phe
145 150 155 160

Arg Leu Ala Met Leu Ala Thr Ala Glu Leu Asp Pro Lys Gly Ser Gly
165 170 175

Gly Asp Leu Ala Ala Ser Thr Trp Gly Gly Trp Ile Ala Tyr Gln Ala
180 185 190

Pro Asp Arg Ala Phe Val Leu Asp Leu Ala Arg Arg Val Gly Val Asp
195 200 205

Arg Thr Leu Lys Ala Pro Trp Pro Gly His Ser Val Arg Arg Leu Pro
210 215 220

Ala Pro Lys Gly Leu Thr Leu Glu Val Gly Trp Thr Gly Glu Pro Ala
225 230 235 240

Ser Thr Ala Ser Leu Val Ser Asp Leu His Arg Arg Thr Trp Arg Gly
245 250 255

Ser Ala Ser His Gln Arg Phe Val Glu Thr Thr Thr Asp Cys Val Arg
260 265 270

Ser Ala Val Thr Ala Leu Glu Ser Gly Asp Asp Thr Ser Leu Leu His
275 280 285

Glu Ile Arg Arg Ala Arg Gln Glu Leu Ala Arg Leu Asp Asp Glu Val
290 295 300

Gly Leu Gly Ile Phe Thr Pro Lys Leu Thr Ala Leu Cys Asp Ala Ala
305 310 315 320

Glu Ala Val Gly Gly Ala Ala Lys Pro Ser Gly Ala Gly Gly Gly Asp
325 330 335

Cys Gly Ile Ala Leu Leu Asp Ala Glu Ala Ser Arg Asp Ile Thr His
340 345 350

Val Arg Gln Arg Trp Glu Thr Ala Gly Val Leu Pro Leu Pro Leu Thr
355 360 365

Pro Ala Leu Glu Gly Ile
370

<210> 113
<211> 360
<212> PRT
<213> Streptomyces griseolosporeus

<400> 113

Met Thr Gly Pro Arg Ala Val Thr Arg Arg Ala Pro Gly Lys Leu Phe
1 5 10 15

Val Ala Gly Glu Tyr Ala Val Val Glu Pro Gly Asn Arg Ala Ile Leu
20 25 30

Val Ala Val Asp Arg Tyr Val Thr Val Thr Val Ser Asp Gly Ala Ala
35 40 45

Pro Gly Val Val Val Ser Ser Asp Ile Gly Ala Gly Pro Val His His
50 55 60

Pro Trp Gln Asp Gly Arg Leu Thr Gly Gly Thr Gly Thr Pro His Val
65 70 75 80

Val Ala Ala Val Glu Thr Val Ala Arg Leu Leu Ala Glu Arg Gly Arg
85 90 95

Ser Val Pro Pro Leu Gly Trp Ser Ile Ser Ser Thr Leu His Glu Asp
100 105 110

Gly Arg Lys Phe Gly Leu Gly Ser Ser Gly Ala Val Thr Val Ala Thr
115 120 125

Val Ser Ala Val Ala Ala His Cys Gly Leu Glu Leu Thr Ala Glu Glu
130 135 140

Arg Phe Arg Thr Ala Leu Ile Ala Ser Ala Arg Ile Asp Pro Arg Gly
145 150 155 160

Ser Gly Gly Asp Ile Ala Thr Ser Thr Trp Gly Gly Trp Ile Ala Tyr
165 170 175

Arg Ala Pro Asp Arg Asp Ala Val Leu Asp Leu Thr Arg Arg Gln Gly
180 185 190

Val Asp Glu Ala Leu Arg Ala Pro Trp Pro Gly Phe Ser Val Arg Leu
195 200 205

Ser Pro Pro Arg Asn Leu Cys Leu Glu Val Gly Trp Thr Gly Asn Pro
210 215 220

Val Ser Thr Thr Ser Leu Leu Thr Asp Leu His Arg Arg Thr Trp Arg
225 230 235 240

Gly Ser Pro Ala Tyr Arg Arg Tyr Val Gly Ala Thr Gly Glu Leu Val
245 250 255

Asp Ala Ala Val Ile Ala Leu Glu Asp Gly Asp Thr Glu Gly Leu Leu
260 265 270

Arg Gln Val Arg Arg Ala Arg His Glu Met Val Arg Leu Asp Asp Glu
275 280 285

Val Gly Leu Gly Ile Phe Thr Pro Glu Leu Thr Ala Leu Cys Ala Ile
290 295 300

Ala Glu Arg Ala Gly Ala Ala Lys Pro Ser Gly Ala Gly Gly Gly Asp
305 310 315 320

Cys Gly Ile Ala Leu Leu Asp Ala Glu Ala Arg Tyr Asp Arg Ser Pro
325 330 335

Leu His Arg Gln Trp Ala Ala Ala Gly Val Leu Pro Leu Leu Val Ser
340 345 350

Pro Ala Thr Glu Gly Val Glu Glu
355 360


<210> 114
<211> 317
<212> PRT
<213> Borrelia

<400> 114

Met Asp Leu Ile Ser Phe Ser Val Pro Gly Asn Leu Leu Leu Met Gly
1 5 10 15

Glu Tyr Thr Ile Leu Glu Glu Lys Gly Leu Gly Leu Ala Ile Ala Ile
20 25 30

Asn Lys Arg Ala Phe Phe Ser Phe Lys Lys Ser Asp Ser Trp Arg Phe
35 40 45

Phe Ser Lys Lys Lys Lys Ile Asp Asp Phe Ser Leu Ile Glu Asn Arg
50 55 60

Ser Asp Phe Val Phe Lys Met Phe Ala Tyr Leu Ser Gln Asn Cys Phe
65 70 75 80

Phe Asn Leu Glu Asn Phe Ala Tyr Asp Val Tyr Ile Asp Thr Ser Asn
85 90 95

Phe Phe Phe Asn Asp Gly Thr Lys Lys Gly Phe Gly Ser Ser Ala Val
100 105 110

Val Ala Ile Gly Ile Val Cys Gly Leu Phe Leu Ile His Asn Ala Thr
115 120 125

Asn Val Val Glu Lys Gly Glu Ile Phe Lys Tyr Cys Leu Glu Ala Tyr
130 135 140

Arg Tyr Ser Gln Gly Gly Ile Gly Ser Gly Tyr Asp Ile Ala Thr Ser
145 150 155 160

Ile Phe Gly Gly Val Ile Glu Phe Glu Gly Gly Phe Asn Pro Lys Cys
165 170 175

Arg Gln Leu Gly Ala Val Glu Phe Asn Asp Phe Tyr Leu Met Gln Gly
180 185 190

Leu Gln Ala Ile Lys Thr Thr Thr Ser Ile Cys Glu Tyr Asn Lys His
195 200 205

Arg Asn Ser Ile Leu Asp Phe Ile Leu Lys Cys Asn Leu Glu Met Lys
210 215 220

Lys Leu Val Leu Asn Ala Ser Asn Ser Lys Ser Ala Leu Ile Ser Ser
225 230 235 240

Leu Arg Arg Ala Lys Glu Leu Gly Leu Ala Ile Gly Glu Ala Ile Gly
245 250 255

Val Ser Ala Ala Leu Pro Ser Ser Phe Asp His Leu Leu Gly Gln Cys
260 265 270

Asp Leu Ile Lys Ala Leu Gly Ala Gly Asn Glu Thr Phe Leu Val Tyr
275 280 285

Ara Pro Asn Ile Glu Ala Phe Asn Leu Ser Lys Ile Ile Ser Ile Val
290 295 300

Leu Glu Asn Glu Gly Ile Lys Phe Glu Ser Asp Lys Cys
305 310 315

<210> 115
<211> 30
<212> DNA
<213> synthetic construct

<400> 115

gggcaagctt gtccacggca cgaccaagca 30

<210> 116
<211> 30
<212> DNA
<213> synthetic construct

<400> 116

cgtaatccgc ggccgcgttt ccagcgcgtc 30

<210> 117

<211> 28 

<212> DNA 

<213> synthetic construct 



<400> 117 

aattaaagga gggtttcata tgaattcg 28 

<210> 118 

<211> 28 

<212> DNA 

<213> synthetic construct 



<400> 118 

gatccgaatt catatgaaac cctccttt 28 

<210> 119 

<211> 30 

<212> DNA 

<213> synthetic construct 



<400> 119 

aaggcctcat atgatttccc ataccccggt 30

<210> 120 

<211> 28 

<212> DNA 

<213> synthetic construct 



<400> 120 

cgggatcctc atcgctccat ctccatgt 28 

<210> 121
<211> 30
<212> DNA
<213> synthetic construct

<400> 121

aaggcctcat atgaccgaca gcaaggatca 30

<210> 122
<211> 28
<212> DNA
<213> synthetic construct

<400> 122

cgggatcctc attgacggat aagcgagg 28

<210> 123
<211> 29
<212> DNA
<213> synthetic construct

<400> 123

aaggcctcat atgaaagtgc ctaagatga 29

<210> 124
<211> 28
<212> DNA
<213> synthetic construct

<400> 124

cgggatcctc aggcctgccg gtcgacat 28

<210> 125
<211> 34



<212> DNA
<213> synthetic construct

<400> 125

aaggcctcat atgagcaccg gcaggcctga agca 34

<210> 126
<211> 31
<212> DNA
<213> synthetic construct

<400> 126

cgggatcctc atccctgccc cggcagcggt t 31

<210> 127
<211> 30
<212> DNA
<213> synthetic construct

<400> 127

aaggcctcat atggatcagg tcatccgcgc 30

<210> 128
<211> 28
<212> DNA
<213> synthetic construct

<400> 128

cgggatcctc agtcatcgaa aacaagtc 28



<210> 129
<211> 30
<212> DNA
<213> synthetic construct

<400> 129

aaggcctcat atgactgatg ccgtccgcga 30

<210> 130
<211> 28
<212> DNA
<213> synthetic construct

<400> 130

cgggatcctc aacgcccctc gaacggcg 28

<210> 131
<211> 31
<212> DNA
<213> synthetic construct

<400> 131

ccggcattcg ggcggcatcc aggtctcgct g 31

<210> 132
<211> 31
<212> DNA
<213> synthetic construct

<400> 132

cagcgagacc tggatgccgc ccgaatgccg g 31

<210> 133
<211> 31
<212> DNA

<213> synthetic construct 



<400> 133 

cgtgcagggc tggattctgt cggaataccc g 31 

<210> 134 

<211> 31 

<212> DNA 

<213> synthetic construct 



<400> 134 

cgggtattcc gacagaatcc agccctgcac g 31 

<210> 135 

<211> 31 

<212> DNA 

<213> synthetic construct 



<400> 135 

gggctgcgcg ccggcatccg gcatttcgac g 31 

<210> 136 

<211> 31 

<212> DNA 

<213> synthetic construct 



<400> 136 

cgtcgaaatg ccggatgccg gcgcgcagcc c 31 

<210> 137
<211> 31
<212> DNA
<213> synthetic construct

<400> 137

gggtgcgacg ggcgagttct tcgatgcgcg g 31

<210> 138
<211> 31
<212> DNA
<213> synthetic construct

<400> 138

ccgcgcatcg aagaactcgc ccgtcgcacc c 31

<210> 139
<211> 31
<212> DNA
<213> synthetic construct

<400> 139

cacgcccgtc acatacgacg aatacgttgc c 31

<210> 140
<211> 31
<212> DNA
<213> synthetic construct

<400> 140

ggcaacgtat tcgtcgtatg tgacgggcgt g 31

<210> 141
<211> 31
<212> DNA
<213> synthetic construct



<400> 141

gaggctcggg cttggctcct cggcggcggt g 31

<210> 142
<211> 31
<212> DNA
<213> synthetic construct

<400> 142

caccgccgcc gaggagccaa gcccgagcct c 31



<210> 143
<211> 31
<212> DNA
<213> synthetic construct

<400> 143

cggcacgctg ctggacccgg gcgacgcctt c 31

<210> 144
<211> 31
<212> DNA
<213> synthetic construct

<400> 144

gaaggcgtcg cccgggtcca gcagcgtgcc g 31

<210> 145
<211> 36
<212> DNA
<213> synthetic construct

<400> 145

tcagaattcg gtaccatatg aagcttggat ccgggg 36



<210> 146
<211> 29
<212> DNA
<213> synthetic construct

<400> 146

ggatccaagc ttcatatggt accgaattc 29

<210> 147
<211> 26
<212> DNA
<213> synthetic construct

<400> 147

ggaattcgct gctgaacgcg atggcg 26

<210> 148
<211> 32
<212> DNA
<213> synthetic construct

<400> 148

ggggtaccat atgtgccttc gttgcgtcag tc 32

<210> 149
<211> 50
<212> DNA
<213> synthetic construct

<400> 149

gatccggcgt gtgcgcaatt taattgcgca cacgccccct gcgtttaaac 50

<210> 150
<211> 50
<212> DNA
<213> synthetic construct

<400> 150

gatcgtttaa acgcaggggg cgtgtgcgca attaaattgc gcacacgccg 50

<210> 151
<211> 30
<212> DNA
<213> synthetic construct

<400> 151

aaggcctcat atgacgccca agcagcaatt 30

<210> 152
<211> 26
<212> DNA
<213> synthetic construct

<400> 152

cgggatccta ggcgctgcgg cggatg 26

<210> 153
<211> 30
<212> DNA
<213> synthetic construct

<400> 153

ccggatcctc atgcctgccg gtcgacatag 30

<210> 154
<211> 30
<212> DNA
<213> synthetic construct

<400> 154

gaaggcacat atgaatcagg tcatccgcgc 30

<210> 155
<211> 30
<212> DNA
<213> synthetic construct

<400> 155

gccggatcct cattcatcga aaacaagtcc 30

<210> 156
<211> 30
<212> DNA
<213> synthetic construct

<400> 156

acgccggatc ctcatcgccc ctcgaacggc 30

<210> 157
<211> 1612
<212> DNA
<213> Paracoccus sp. R114

<220>

<221> CDS
<222> (59)..(292)
<223> XseB

<220>
<221> CDS
<222> (1185)..(1610)
<223> IspA

<220>
<221> CDS
<222> (295)..(1158)
<223> Dxs

<400> 157

ccatggcatc cgggtcggat gccgtctatg ttggcccgaa caggcagcag gaggcccc 58

atg agc gat atc cag acc ctc tcg ttc gag gaa gcc atg cgc gag ctg 106
Met Ser Asp Ile Gln Thr Leu Ser Phe Glu Glu Ala Met Arg Glu Leu
1 5 10 15

gag gcg acc gtc ggc aag ctg gaa acc ggc gag gcg acg ctc gag gac 154
Glu Ala Thr Val Gly Lys Leu Glu Thr Gly Glu Ala Thr Leu Glu Asp
20 25 30

tcc atc gcg ctc tat gaa cgc ggg gcg gcg ctg cgc gcc cat tgc gaa 202
Ser Ile Ala Leu Tyr Glu Arg Gly Ala Ala Leu Arg Ala His Cys Glu
35 40 45

acc cgc ctg cgc gag gcc gag gag cgg gtc gag aag atc acc ctg gcc 250
Thr Arg Leu Arg Glu Ala Glu Glu Arg Val Glu Lys Ile Thr Leu Ala
50 55 60

gcg aac ggg cag ccg tcc gga acc gag ccc gcc gag ggc ctg tg atg 297
Ala Asn Gly Gln Pro Ser Gly Thr Glu Pro Ala Glu Gly Leu Met
65 70 75

cag gcc cgc ctg gcc gag atc cgg ccc ctg gtc gag gcc gag ctg aac 345
Gln Ala Arg Leu Ala Glu Ile Arg Pro Leu Val Glu Ala Glu Leu Asn
80 85 90 95

gcc gcc atc gac gcg ctg ccc gcg ggc gat ctg tcg gat gcg atg cgc 393
Ala Ala Ile Asp Ala Leu Pro Ala Gly Asp Leu Ser Asp Ala Met Arg
100 105 110

tat gcc gtg cag ggc ggc aag cgg ctg cgc gcg ttc ctg gtg atg gag 441
Tyr Ala Val Gln Gly Gly Lys Arg Leu Arg Ala Phe Leu Val Met Glu
115 120 125

tcg gcg cgc ctg cac ggg ctg gac gac gac gca tcg ctg ccc gtc gcc 489
Ser Ala Arg Leu His Gly Leu Asp Asp Asp Ala Ser Leu Pro Val Ala
130 135 140

gcc gcg gtc gag gcg ctg cac gcc tac agc ttg gtc cat gac gac ctg 537
Ala Ala Val Glu Ala Leu His Ala Tyr Ser Leu Val His Asp Asp Leu
145 150 155

ccc gcg atg gat gac gac gac ctg cgg cgc ggt cag ccc acc gtc cac 585
Pro Ala Met Asp Asp Asp Asp Leu Arg Arg Gly Gln Pro Thr Val His
160 165 170 175

gtc aaa tgg acc gag gcg acc gcg atc ctt gcg ggc gat gcg ctg cag 633
Val Lys Trp Thr Glu Ala Thr Ala Ile Leu Ala Gly Asp Ala Leu Gln
180 185 190

acg ctg gcc ttc cag ctg ctg gcc gat ccg cgc gtg ggc gac gat gcg 681
Thr Leu Ala Phe Gln Leu Leu Ala Asp Pro Arg Val Gly Asp Asp Ala
195 200 205

gcg cgg atg cgg ctg gtc ggt tcg ctg gcg cag gca tcg ggg gct gcg 729
Ala Arg Met Arg Leu Val Gly Ser Leu Ala Gln Ala Ser Gly Ala Ala
210 215 220

ggc atg gtc tgg ggc cag gcg ctg gac atc gcg gcc gag acc tcg ggc 777
Gly Met Val Trp Gly Gln Ala Leu Asp Ile Ala Ala Glu Thr Ser Gly
225 230 235

gtg ccg ctg gat ctg gac gcg atc atc cgc ctg cag ggt ggc aag acc 825
Val Pro Leu Asp Leu Asp Ala Ile Ile Arg Leu Gln Gly Gly Lys Thr
240 245 250 255

ggc gcg ctg atc cgc ttt gcc gcg acc gcc ggg ccg ctg atg gcg ggg 873
Gly Ala Leu Ile Arg Phe Ala Ala Thr Ala Gly Pro Leu Met Ala Gly
260 265 270

gcg gac cct gcc gcg ctg gac gat tat gcg cag gcc gtc ggg ctg gcc 921
Ala Asp Pro Ala Ala Leu Asp Asp Tyr Ala Gln Ala Val Gly Leu Ala
275 280 285

ttc cag atc gcg gac gac atc ctg gac gtc gag ggc tgc gag gcc gcg 969
Phe Gln Ile Ala Asp Asp Ile Leu Asp Val Glu Gly Cys Glu Ala Ala
290 295 300

acc ggc aag cgc gtc ggc aag gat gcg gat gcc aac aag gcg acc ttc 1017
Thr Gly Lys Arg Val Gly Lys Asp Ala Asp Ala Asn Lys Ala Thr Phe
305 310 315

gtc tcg ctg ctg ggc ctc gag ggg gcg cgg tcc gag gcg cgt cgc ctg 1065
Val Ser Leu Leu Gly Leu Glu Gly Ala Arg Ser Glu Ala Arg Arg Leu
320 325 330 335

gcc gat gcg ggg cag gac gcg ctg gcg ggt tac ggc gat gct gcg ggg 1113
Ala Asp Ala Gly Gln Asp Ala Leu Ala Gly Tyr Gly Asp Ala Ala Gly
340 345 350

aac ctt cgg gac ctg gcg cgc ttc gtg atc gaa cgc gac agc tga 1158
Asn Leu Arg Asp Leu Ala Arg Phe Val Ile Glu Arg Asp Ser
355 360 365

tcgccgcctt cccgccaagg ggcaag atg atg acc gac gga ccc gca acc ccg 1211
Met Met Thr Asp Gly Pro Ala Thr Pro
370

atc ctg gac cgc gtc cag cag cca tcc gac ctg gca tcg ctg gac gat 1259
Ile Leu Asp Arg Val Gln Gln Pro Ser Asp Leu Ala Ser Leu Asp Asp
375 380 385 390

gcg cag ctg cgc ctg ctg gcg gac gag ctg cgg gcc gag acc atc gac 1307
Ala Gln Leu Arg Leu Leu Ala Asp Glu Leu Arg Ala Glu Thr Ile Asp
395 400 405

atc gtc agc cgc acg ggc ggt cac ctg ggc gcg ggg ctg ggc gtg gtc 1355
Ile Val Ser Arg Thr Gly Gly His Leu Gly Ala Gly Leu Gly Val Val
410 415 420

gaa ctg acg gtc gcc ctg cac gcc gtc ttt cgg gcg ccg cgc gac aag 1403
Glu Leu Thr Val Ala Leu His Ala Val Phe Arg Ala Pro Arg Asp Lys
425 430 435

atc gtc tgg gac gtg ggg cat caa tgc tat ccc cac aag atc ctg acg 1451
Ile Val Trp Asp Val Gly His Gln Cys Tyr Pro His Lys Ile Leu Thr
440 445 450

ggc agg cgg gac cgg atg cgc acg ctg cgc atg ggc ggc ggg ctg tcg 1499
Gly Arg Arg Asp Arg Met Arg Thr Leu Arg Met Gly Gly Gly Leu Ser
455 460 465 470

ggg ttc acc aag cgg cag gaa agc gcg ttc gat ccg ttc ggt gcg ggg 1547
Gly Phe Thr Lys Arg Gln Glu Ser Ala Phe Asp Pro Phe Gly Ala Gly
475 480 485

cac agc tcg acc tcg atc tcg gcg gcg ctg ggc ttc gcg atg gcg cgt 1595
His Ser Ser Thr Ser Ile Ser Ala Ala Leu Gly Phe Ala Met Ala Arg
490 495 500

gaa ctt ggc ggg gat cc 1612
Glu Leu Gly Gly Asp
505



<210> 158
<211> 78
<212> PRT
<213> Paracoccus sp. R114

<400> 158

Met Ser Asp Ile Gln Thr Leu Ser Phe Glu Glu Ala Met Arg Glu Leu



Ala Thr Val Gly Lys Leu Glu Thr Gly Glu Ala Thr Leu Glu Asp 



Glu 

20 



25 30 



Ser Ile Ala Leu Tyr Glu Arg Gly Ala Ala Leu Arg Ala His Cys Glu
35 40 45

Thr Arg Leu Arg Glu Ala Glu Glu Arg Val Glu Lys Ile Thr Leu Ala
50 55 60

Ala Asn Gly Gln Pro Ser Gly Thr Glu Pro Ala Glu Gly Leu
65 70 75



<210> 159
<211> 287
<212> PRT
<213> Paracoccus sp. R114

<400> 159

Met Gln Ala Arg Leu Ala Glu Ile Arg Pro Leu Val Glu Ala Glu Leu
1 5 10 15

Asn Ala Ala Ile Asp Ala Leu Pro Ala Gly Asp Leu Ser Asp Ala Met
20 25 30

Arg Tyr Ala Val Gln Gly Gly Lys Arg Leu Arg Ala Phe Leu Val Met
35 40 45

Glu Ser Ala Arg Leu His Gly Leu Asp Asp Asp Ala Ser Leu Pro Val
50 55 60

Ala Ala Ala Val Glu Ala Leu His Ala Tyr Ser Leu Val His Asp Asp
65 70 75 80

Leu Pro Ala Met Asp Asp Asp Asp Leu Arg Arg Gly Gln Pro Thr Val
85 90 95

His Val Lys Trp Thr Glu Ala Thr Ala Ile Leu Ala Gly Asp Ala Leu
100 105 110

Gln Thr Leu Ala Phe Gln Leu Leu Ala Asp Pro Arg Val Gly Asp Asp
115 120 125

Ala Ala Arg Met Arg Leu Val Gly Ser Leu Ala Gln Ala Ser Gly Ala
130 135 140

Ala Gly Met Val Trp Gly Gln Ala Leu Asp Ile Ala Ala Glu Thr Ser
145 150 155 160

Gly Val Pro Leu Asp Leu Asp Ala Ile Ile Arg Leu Gln Gly Gly Lys
165 170 175

Thr Gly Ala Leu Ile Arg Phe Ala Ala Thr Ala Gly Pro Leu Met Ala
180 185 190

Gly Ala Asp Pro Ala Ala Leu Asp Asp Tyr Ala Gln Ala Val Gly Leu
195 200 205

Ala Phe Gln Ile Ala Asp Asp Ile Leu Asp Val Glu Gly Cys Glu Ala
210 215 220

Ala Thr Gly Lys Arg Val Gly Lys Asp Ala Asp Ala Asn Lys Ala Thr
225 230 235 240

Phe Val Ser Leu Leu Gly Leu Glu Gly Ala Arg Ser Glu Ala Arg Arg
245 250 255

Leu Ala Asp Ala Gly Gln Asp Ala Leu Ala Gly Tyr Gly Asp Ala Ala
260 265 270

Gly Asn Leu Arg Asp Leu Ala Arg Phe Val Ile Glu Arg Asp Ser
275 280 285

<210> 160
<211> 142
<212> PRT
<213> Paracoccus sp. R114

<400> 160

Met Met Thr Asp Gly Pro Ala Thr Pro Ile Leu Asp Arg Val Gln Gln
1 5 10 15

Pro Ser Asp Leu Ala Ser Leu Asp Asp Ala Gln Leu Arg Leu Leu Ala
20 25 30

Asp Glu Leu Arg Ala Glu Thr Ile Asp Ile Val Ser Arg Thr Gly Gly
35 40 45

His Leu Gly Ala Gly Leu Gly Val Val Glu Leu Thr Val Ala Leu His
50 55 60

Ala Val Phe Arg Ala Pro Arg Asp Lys Ile Val Trp Asp Val Gly His
65 70 75 80

Gln Cys Tyr Pro His Lys Ile Leu Thr Gly Arg Arg Asp Arg Met Arg
85 90 95

Thr Leu Arg Met Gly Gly Gly Leu Ser Gly Phe Thr Lys Arg Gln Glu
100 105 110

Ser Ala Phe Asp Pro Phe Gly Ala Gly His Ser Ser Thr Ser Ile Ser
115 120 125

Ala Ala Leu Gly Phe Ala Met Ala Arg Glu Leu Gly Gly Asp
130 135 140



<210> 161 

<211> 6 

<212> PRT 

<213> Bradyrhizobium japonicum 



<400> 161

Val His Asp Asp Leu Pro
1 5

<210> 162 

<211> 6 

<212> PRT 

<213> Rhizobium sp. strain NGR234 



<400> 162 

Val His Asp Asp Leu Pro 
1 5 

<210> 163 

<211> 6 

<212> PRT 

<213> Bacillus stearothermophilus 



<400> 163 

Ile His Asp Asp Leu Pro
1 5 

<210> 164 

<211> 6 

<212> PRT 

<213> Bacillus subtilis 



<400> 164 

Ile His Asp Asp Leu Pro
1 5 

<210> 165 

<211> 6 

<212> PRT 

<213> Escherichia coli 




WO 02/099095 



PCT/EP02/06171 



<400> 165 

Ile His Asp Asp Leu Pro
1 5 

<210> 166 

<211> 6 

<212> PRT 

<213> Haemophilus influenzae 



<400> 166 

Ile His Asp Asp Leu Pro
1 5 

<210> 167 

<211> 16 

<212> DNA 

<213> synthetic construct 



<400> 167 
tccaygayga yctgcc 16

<210> 168

<211> 5

<212> PRT

<213> Bradyrhizobium japonicum

<400> 168

Asp Asp Ile Leu Asp
1 5

<210> 169 

<211> 5 

<212> PRT 

<213> Rhizobium sp. strain NGR234 






<400> 169 

Asp Asp Ile Leu Asp
1 5 

<210> 170 

<211> 5 

<212> PRT 

<213> Bacillus stearothermophilus 



<400> 170 

Asp Asp Ile Leu Asp
1 5 

<210> 171 

<211> 5 

<212> PRT 

<213> Bacillus subtilis 



<400> 171 

Asp Asp Ile Leu Asp
1 5 

<210> 172

<211> 5 

<212> PRT 

<213> Escherichia coli 



<400> 172 

Asp Asp Ile Leu Asp
1 5 

<210> 173

<211> 5

<212> PRT

<213> Haemophilus influenzae

<400> 173

Asp Asp Ile Leu Asp
1 5

<210> 174 

<211> 15

<212> DNA 

<213> synthetic construct



<400> 174 
gaygayatcc tggay 
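
The degenerate primers in SEQ ID NOs: 167 and 174 use the IUPAC ambiguity code y (c or t). As an illustration only, and not part of the listing itself, the following minimal Python sketch enumerates the concrete sequences such a primer stands for. The function name expand_degenerate and the lookup table IUPAC are hypothetical helpers introduced here.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, lowercase to match the listing's notation.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence matched by a degenerate primer.

    Spaces (the listing groups bases in blocks of ten) are ignored.
    """
    bases = primer.replace(" ", "").lower()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


if __name__ == "__main__":
    # SEQ ID NO: 174 as printed in the listing.
    for seq in expand_degenerate("gaygayatcc tggay"):
        print(seq)
```

Each y doubles the number of variants, so a primer with three y positions corresponds to a pool of eight sequences.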

<210> 175 

<211> 1176

<212> DNA 

<213> Paracoccus sp. R1534 



<220> 

<221> CDS 

<222> (1)..(1173)

<223> acetyl-CoA acetyltransferase

<400> 175



atg gac ccc atc gtc atc acc ggc gcg atg cgc acc ccg atg ggg gca 48
Met Asp Pro Ile Val Ile Thr Gly Ala Met Arg Thr Pro Met Gly Ala
1 5 10 15

ttc cag ggc gat ctt gcc gcg atg gat gcc ccg acc ctt ggc gcg gcc 96
Phe Gln Gly Asp Leu Ala Ala Met Asp Ala Pro Thr Leu Gly Ala Ala
20 25 30

aca atc cac gcc gcg ctg aac ggc ctg tcg ccc gac atg gtg gac gag 144
Ala Ile Arg Ala Ala Leu Asn Gly Leu Ser Pro Asp Met Val Asp Glu
35 40 45

gtg ctg atg ggc tgc gtc ctg ccc gcg ggc cag ggt cag gca ccg gca 192
Val Leu Met Gly Cys Val Leu Pro Ala Gly Gln Gly Gln Ala Pro Ala
50 55 60

cgt cag gcg gcg ctt gac gcc gga ctg ccg ctg tcg gcg ggc gcg acc 240
Arg Gln Ala Ala Leu Asp Ala Gly Leu Pro Leu Ser Ala Gly Ala Thr
65 70 75 80

acc atc aac aag atg tgc gga tcg ggc atg aag gcc gcg atg ctg ggc 288
Thr Ile Asn Lys Met Cys Gly Ser Gly Met Lys Ala Ala Met Leu Gly
85 90 95

cat gac ctg atc gcc gcg gga tcg gcg ggc atc gtc gtc gcc ggc ggg 336
His Asp Leu Ile Ala Ala Gly Ser Ala Gly Ile Val Val Ala Gly Gly
100 105 110

atg gag agc atg tcg aac gcc ccc tac ctg ctg ccc aag gcg cgg tcg 384
Met Glu Ser Met Ser Asn Ala Pro Tyr Leu Leu Pro Lys Ala Arg Ser
115 120 125

ggg atg cgc atg ggc cat gac cgt gtg ctg gat cac atg ttc ctc gac 432
Gly Met Arg Met Gly His Asp Arg Val Leu Asp His Met Phe Leu Asp
130 135 140

ggg ttg gag gac gcc tat gac aag ggc cgc ctg atg ggc acc ttc gcc 480
Gly Leu Glu Asp Ala Tyr Asp Lys Gly Arg Leu Met Gly Thr Phe Ala
145 150 155 160

gag gat tgc gcc ggc gat cac ggt ttc acc cgc gag gcg cag aac aac 528
Glu Asp Cys Ala Gly Asp His Gly Phe Thr Arg Glu Ala Gln Asp Asp
165 170 175

tat gcg ctg acc agc ctg gcc cgc gcg cag gac gcc atc gcc agc ggt 576
Tyr Ala Leu Thr Ser Leu Ala Arg Ala Gln Asp Ala Ile Ala Ser Gly
180 185 190

gcc ttc gcc gcc gag atc gcg ccc gtg acc gtc acg gca cgc aag gtg 624
Ala Phe Ala Ala Glu Ile Ala Pro Val Thr Val Thr Ala Arg Lys Val
195 200 205

^f 9 9t f 9at aC ° 9a ° gag atg ccc aa 9 S cc cgc ccc gag 672
Gln Thr Thr Val Asp Thr Asp Glu Met Pro Gly Lys Ala Arg Pro Glu
210 215 220

Itl o° C f tg f 39 CC ° 9CC ttC Cgt gaC ggt ggc ac 9 <? tc 720
Lys Ile Pro His Leu Lys Pro Ala Phe Arg Asp Gly Gly Thr Val Thr
225 230 235 240

gcg gcg aac agc tcg tcg atc tcg gac ggg gcg gcg gcg ctg gtg atg 768
Ala Ala Asn Ser Ser Ser Ile Ser Asp Gly Ala Ala Ala Leu Val Met
245 250 255

atg cgc cag tcg cag gcc gag aag ctg ggc ctg acg ccg atc gcg cgg 816
Met Arg Gln Ser Gln Ala Glu Lys Leu Gly Leu Thr Pro Ile Ala Arg
260 265 270

atc atc ggt cat gcg acc cat gcc gac cgt ccc ggc ctg ttc ccg acg 864
Ile Ile Gly His Ala Thr His Ala Asp Arg Pro Gly Leu Phe Pro Thr
275 280 285

gcc ccc atc ggc gcg atg cgc aag ctg ctg gac cgc acg gac acc cgc 912
Ala Pro Ile Gly Ala Met Arg Lys Leu Leu Asp Arg Thr Asp Thr Arg
290 295 300

ctt ggc gat tac gac ctg ttc gag gtg aac gag gca ttc gcc gtc gtc 960
Leu Gly Asp Tyr Asp Leu Phe Glu Val Asn Glu Ala Phe Ala Val Val
305 310 315 320

gcc atg atc gcg atg aag gag ctt ggc ctg cca cac gat gcc acg aac 1008
Ala Met Ile Ala Met Lys Glu Leu Gly Leu Pro His Asp Ala Thr Asn
325 330 335

atc aac ggc ggg gcc tgc gcg ctt ggg cat ccc atc ggc gcg tcg ggg 1056
Ile Asn Gly Gly Ala Cys Ala Leu Gly His Pro Ile Gly Ala Ser Gly
340 345 350

gcg cgg atc atg gtc acg ctg ctg aac gcg atg gcg gcg cgg ggc gcg 1104
Ala Arg Ile Met Val Thr Leu Leu Asn Ala Met Ala Ala Arg Gly Ala
355 360 365

acg cgc ggg gcc gca tcc gtc tgc atc ggc ggg ggc gag gcg acg gcc 1152
Thr Arg Gly Ala Ala Ser Val Cys Ile Gly Gly Gly Glu Ala Thr Ala
370 375 380

atc gcg ctg gaa cgg ctg agc taa 1176
Ile Ala Leu Glu Arg Leu Ser
385 390

<210> 176

<211> 391

<212> PRT

<213> Paracoccus sp. R1534

<400> 176

Met Asp Pro Ile Val Ile Thr Gly Ala Met Arg Thr Pro Met Gly Ala
1 5 10 15

Phe Gln Gly Asp Leu Ala Ala Met Asp Ala Pro Thr Leu Gly Ala Ala
20 25 30

Ala Ile Arg Ala Ala Leu Asn Gly Leu Ser Pro Asp Met Val Asp Glu
35 40 45



Val Leu Met Gly Cys Val Leu Pro Ala Gly Gln Gly Gln Ala Pro Ala
50 55 60

Arg Gln Ala Ala Leu Asp Ala Gly Leu Pro Leu Ser Ala Gly Ala Thr
65 70 75 80

Thr Ile Asn Lys Met Cys Gly Ser Gly Met Lys Ala Ala Met Leu Gly
85 90 95

His Asp Leu Ile Ala Ala Gly Ser Ala Gly Ile Val Val Ala Gly Gly
100 105 110

Met Glu Ser Met Ser Asn Ala Pro Tyr Leu Leu Pro Lys Ala Arg Ser
115 120 125

Gly Met Arg Met Gly His Asp Arg Val Leu Asp His Met Phe Leu Asp
130 135 140

Gly Leu Glu Asp Ala Tyr Asp Lys Gly Arg Leu Met Gly Thr Phe Ala
145 150 155 160

Glu Asp Cys Ala Gly Asp His Gly Phe Thr Arg Glu Ala Gln Asp Asp
165 170 175

Tyr Ala Leu Thr Ser Leu Ala Arg Ala Gln Asp Ala Ile Ala Ser Gly
180 185 190

Ala Phe Ala Ala Glu Ile Ala Pro Val Thr Val Thr Ala Arg Lys Val
195 200 205

Gln Thr Thr Val Asp Thr Asp Glu Met Pro Gly Lys Ala Arg Pro Glu
210 215 220

Lys Ile Pro His Leu Lys Pro Ala Phe Arg Asp Gly Gly Thr Val Thr
225 230 235 240

Ala Ala Asn Ser Ser Ser Ile Ser Asp Gly Ala Ala Ala Leu Val Met
245 250 255

Met Arg Gln Ser Gln Ala Glu Lys Leu Gly Leu Thr Pro Ile Ala Arg
260 265 270

Ile Ile Gly His Ala Thr His Ala Asp Arg Pro Gly Leu Phe Pro Thr
275 280 285

Ala Pro Ile Gly Ala Met Arg Lys Leu Leu Asp Arg Thr Asp Thr Arg
290 295 300

Leu Gly Asp Tyr Asp Leu Phe Glu Val Asn Glu Ala Phe Ala Val Val
305 310 315 320

Ala Met Ile Ala Met Lys Glu Leu Gly Leu Pro His Asp Ala Thr Asn
325 330 335

Ile Asn Gly Gly Ala Cys Ala Leu Gly His Pro Ile Gly Ala Ser Gly
340 345 350

Ala Arg Ile Met Val Thr Leu Leu Asn Ala Met Ala Ala Arg Gly Ala
355 360 365

Thr Arg Gly Ala Ala Ser Val Cys Ile Gly Gly Gly Glu Ala Thr Ala
370 375 380

Ile Ala Leu Glu Arg Leu Ser
385 390



<210> 177 

<211> 1980 

<212> DNA 

<213> Paracoccus sp. R114

<220>

<221> CDS

<222> (1)..(1170)

<223> phaA

<220>

<221> misc_feature
<222> (1179)..(1194)

<223> inverted repeat between genes constituting a putative transcriptional sto

<220>

<221> misc_feature
<222> (1196)..(1210)

<223> inverted repeat between genes constituting a putative transcriptional sto

<220>

<221> CDS

<222> (1258)..(1980)

<223> phaB

<400> 177

atg acc aaa gcc gta atc gta tct gcc gca cgt acc ccc gtc ggc agc 48
Met Thr Lys Ala Val Ile Val Ser Ala Ala Arg Thr Pro Val Gly Ser
1 5 10 15

ttc atg ggc gca ttc gcc aat gtc ccc gca cat gat ctg ggc gcc gcc 96
Phe Met Gly Ala Phe Ala Asn Val Pro Ala His Asp Leu Gly Ala Ala
20 25 30

gtc ctg cgc gag gtc gtg gcc cgc gcc ggt gtc gac ccc gcc gag gtc 144
Val Leu Arg Glu Val Val Ala Arg Ala Gly Val Asp Pro Ala Glu Val
35 40 45

agc gag acg atc ctg ggc cag gtg ctg acc gcc gcg cag ggc cag aac 192
Ser Glu Thr Ile Leu Gly Gln Val Leu Thr Ala Ala Gln Gly Gln Asn
50 55 60

ccc gcg cgc cag gcg cat atc aat gcg ggc ctg ccc aag gaa tcg gcg 240
Pro Ala Arg Gln Ala His Ile Asn Ala Gly Leu Pro Lys Glu Ser Ala
65 70 75 80

gcg tgg ctc atc aac cag gtc tgc ggc tcg ggg ctg cgc gcc gtc gcg 288
Ala Trp Leu Ile Asn Gln Val Cys Gly Ser Gly Leu Arg Ala Val Ala
85 90 95

ctg gcg gcg cag cag gtc atg ctg ggc gat gcg cag atc gtt ctg gcg 336
Leu Ala Ala Gln Gln Val Met Leu Gly Asp Ala Gln Ile Val Leu Ala
100 105 110

ggg ggc cag gag agc atg tcg ctg tcg acc cat gcc gcc tat ctg cgc 384
Gly Gly Gln Glu Ser Met Ser Leu Ser Thr His Ala Ala Tyr Leu Arg
115 120 125

gcg ggc cag aag atg ggc gac atg aag atg atc gac acc atg atc cgc 432
Ala Gly Gln Lys Met Gly Asp Met Lys Met Ile Asp Thr Met Ile Arg
130 135 140

gac ggg ctg tgg gat gcc ttc aac ggc tat cac atg ggt cag acc gcc 480
Asp Gly Leu Trp Asp Ala Phe Asn Gly Tyr His Met Gly Gln Thr Ala
145 150 155 160

gag aac gtg gcc gac cag tgg tcg atc agc cgc gac cag cag gac gaa 528
Glu Asn Val Ala Asp Gln Trp Ser Ile Ser Arg Asp Gln Gln Asp Glu
165 170 175

ttc gcc ctg gct tcg cag aac aag gcc gag gcc gcg cag aat gcg ggc 576
Phe Ala Leu Ala Ser Gln Asn Lys Ala Glu Ala Ala Gln Asn Ala Gly
180 185 190

cgc ttc gat gac gaa atc gtc gcc tat acc gtc aag ggc cgc aag ggc 624
Arg Phe Asp Asp Glu Ile Val Ala Tyr Thr Val Lys Gly Arg Lys Gly
195 200 205

gac acg gtc gtc gac aag gac gaa tac atc cgc cac ggc gcc acg atc 672
Asp Thr Val Val Asp Lys Asp Glu Tyr Ile Arg His Gly Ala Thr Ile
210 215 220

gag ggc atg cag aag ctg cgc ccc gcc ttc acc aag gaa ggc tcg gtc 720
Glu Gly Met Gln Lys Leu Arg Pro Ala Phe Thr Lys Glu Gly Ser Val
225 230 235 240

acg gcg ggc aac gcg tcg ggc ctg aac gac ggc gcg gcg gcc gtc atg 768
Thr Ala Gly Asn Ala Ser Gly Leu Asn Asp Gly Ala Ala Ala Val Met
245 250 255

gtc atg tcc gag gac gag gcc gca cgc cgc ggg ctg acg ccg ctg gcg 816
Val Met Ser Glu Asp Glu Ala Ala Arg Arg Gly Leu Thr Pro Leu Ala
260 265 270

cgc atc gcc tcc tat gcg acg gcg ggc ctc gac ccg gcg atc atg ggc 864
Arg Ile Ala Ser Tyr Ala Thr Ala Gly Leu Asp Pro Ala Ile Met Gly
275 280 285

acc ggg ccg atc ccc tcc agc cgc aag gcg ctg gaa aag gcg ggc tgg 912
Thr Gly Pro Ile Pro Ser Ser Arg Lys Ala Leu Glu Lys Ala Gly Trp
290 295 300

tcg gtc ggc gac ctg gac ctg gtc gag gcg aac gag gcc ttt gcc gcg 960
Ser Val Gly Asp Leu Asp Leu Val Glu Ala Asn Glu Ala Phe Ala Ala
305 310 315 320

cag gcc tgc gcc gtg aac aag gac atg ggc tgg gat ccg tcc atc gtg 1008
Gln Ala Cys Ala Val Asn Lys Asp Met Gly Trp Asp Pro Ser Ile Val
325 330 335

aac gtc aac ggc ggc gcg atc gcc atc ggc cac ccg atc ggc gcc tcg 1056
Asn Val Asn Gly Gly Ala Ile Ala Ile Gly His Pro Ile Gly Ala Ser
340 345 350

ggg gcg cgg atc ctg aac acc ctg ctg ttc gag atg cag cgc cgc gac 1104
Gly Ala Arg Ile Leu Asn Thr Leu Leu Phe Glu Met Gln Arg Arg Asp
355 360 365

gcc aag aag ggc ctt gcg acg ctg tgc atc ggc ggc ggc atg ggc gtc 1152
Ala Lys Lys Gly Leu Ala Thr Leu Cys Ile Gly Gly Gly Met Gly Val
370 375 380

gcc atg tgc ctc gaa cgc tgaacgaccg gcgtgtgcgc aatttaattg 1200
Ala Met Cys Leu Glu Arg
385 390

cgcacacgcc ccctgcaaag tagcaatgtt ttacgataac gaatgaaggg gggaatc 1257

atg tcc aag gta gca ctg gtc acc ggc gga tcg cgc ggc atc ggc gcc 1305
Met Ser Lys Val Ala Leu Val Thr Gly Gly Ser Arg Gly Ile Gly Ala
395 400 405

gag atc tgc aag gcg ctt cag gcc gca ggc tat acc gtc gcc gcg aac 1353
Glu Ile Cys Lys Ala Leu Gln Ala Ala Gly Tyr Thr Val Ala Ala Asn
410 415 420

tat gcc ggc aat gac gac gcg gcc aag gcc ttc acc gag gaa acc ggc 1401
Tyr Ala Gly Asn Asp Asp Ala Ala Lys Ala Phe Thr Glu Glu Thr Gly
425 430 435

atc aag acc tac aag tgg tcg gtc gcc gat tac gat gcc tgc aag gcc 1449
Ile Lys Thr Tyr Lys Trp Ser Val Ala Asp Tyr Asp Ala Cys Lys Ala
440 445 450

ggc atc gcc cag gtc gaa gag gat ctg ggc ccg atc gcc gtg ctg atc 1497
Gly Ile Ala Gln Val Glu Glu Asp Leu Gly Pro Ile Ala Val Leu Ile
455 460 465 470

aac aat gcc ggg atc acc cgc gac gcg ccc ttc cac aag atg acg ccc 1545
Asn Asn Ala Gly Ile Thr Arg Asp Ala Pro Phe His Lys Met Thr Pro
475 480 485

gag aag tgg aag gag gtc atc gac acc aac ctg acc ggc acc ttc aac 1593
Glu Lys Trp Lys Glu Val Ile Asp Thr Asn Leu Thr Gly Thr Phe Asn
490 495 500

atg acc cat ccg gtc tgg ccg ggc atg cgc gaa cgc aag ttc gga cgc 1641
Met Thr His Pro Val Trp Pro Gly Met Arg Glu Arg Lys Phe Gly Arg
505 510 515

gtc atc aac atc agc tcg atc aac ggg cag aag ggc cag ttc ggg cag 1689
Val Ile Asn Ile Ser Ser Ile Asn Gly Gln Lys Gly Gln Phe Gly Gln
520 525 530

gcg aac tat gcc gcg gcc aag gcg ggc gac ctg ggc ttc acc aag tcg 1737
Ala Asn Tyr Ala Ala Ala Lys Ala Gly Asp Leu Gly Phe Thr Lys Ser
535 540 545 550

ctg gcg cag gaa ggc gcg cgc aac aac atc acc gtc aac gcg atc tgc 1785
Leu Ala Gln Glu Gly Ala Arg Asn Asn Ile Thr Val Asn Ala Ile Cys
555 560 565

ccc ggc tat atc gcg acg gac atg gtg atg gcc gtt ccc gaa cag gtc 1833
Pro Gly Tyr Ile Ala Thr Asp Met Val Met Ala Val Pro Glu Gln Val
570 575 580

cgc gag ggg atc atc gcg cag atc ccc gtc ggc cgc ttg ggc gag ccg 1881
Arg Glu Gly Ile Ile Ala Gln Ile Pro Val Gly Arg Leu Gly Glu Pro
585 590 595

tcc gag atc gcg cgc tgc gtg gtg ttc ctg gcc tcc gac gat gcg ggc 1929
Ser Glu Ile Ala Arg Cys Val Val Phe Leu Ala Ser Asp Asp Ala Gly
600 605 610

ttc gtc aca ggc tcg acc atc acg gcg aat ggc ggc cag tac tac atc 1977
Phe Val Thr Gly Ser Thr Ile Thr Ala Asn Gly Gly Gln Tyr Tyr Ile
615 620 625 630

tga 1980

<210> 178

<211> 390

<212> PRT

<213> Paracoccus sp. R114



<220>

<221> misc_feature
<222> (1179)..(1194)

<223> inverted repeat between genes constituting a putative transcriptional sto

<220>

<221> misc_feature
<222> (1196)..(1210)

<223> inverted repeat between genes constituting a putative transcriptional sto

<400> 178

Met Thr Lys Ala Val Ile Val Ser Ala Ala Arg Thr Pro Val Gly Ser
1 5 10 15

Phe Met Gly Ala Phe Ala Asn Val Pro Ala His Asp Leu Gly Ala Ala
20 25 30

Val Leu Arg Glu Val Val Ala Arg Ala Gly Val Asp Pro Ala Glu Val
35 40 45

Ser Glu Thr Ile Leu Gly Gln Val Leu Thr Ala Ala Gln Gly Gln Asn
50 55 60

Pro Ala Arg Gln Ala His Ile Asn Ala Gly Leu Pro Lys Glu Ser Ala
65 70 75 80

Ala Trp Leu Ile Asn Gln Val Cys Gly Ser Gly Leu Arg Ala Val Ala
85 90 95

Leu Ala Ala Gln Gln Val Met Leu Gly Asp Ala Gln Ile Val Leu Ala
100 105 110

Gly Gly Gln Glu Ser Met Ser Leu Ser Thr His Ala Ala Tyr Leu Arg
115 120 125

Ala Gly Gln Lys Met Gly Asp Met Lys Met Ile Asp Thr Met Ile Arg
130 135 140

Asp Gly Leu Trp Asp Ala Phe Asn Gly Tyr His Met Gly Gln Thr Ala
145 150 155 160

Glu Asn Val Ala Asp Gln Trp Ser Ile Ser Arg Asp Gln Gln Asp Glu
165 170 175

Phe Ala Leu Ala Ser Gln Asn Lys Ala Glu Ala Ala Gln Asn Ala Gly
180 185 190

Arg Phe Asp Asp Glu Ile Val Ala Tyr Thr Val Lys Gly Arg Lys Gly
195 200 205

Asp Thr Val Val Asp Lys Asp Glu Tyr Ile Arg His Gly Ala Thr Ile
210 215 220

Glu Gly Met Gln Lys Leu Arg Pro Ala Phe Thr Lys Glu Gly Ser Val
225 230 235 240

Thr Ala Gly Asn Ala Ser Gly Leu Asn Asp Gly Ala Ala Ala Val Met
245 250 255

Val Met Ser Glu Asp Glu Ala Ala Arg Arg Gly Leu Thr Pro Leu Ala
260 265 270

Arg Ile Ala Ser Tyr Ala Thr Ala Gly Leu Asp Pro Ala Ile Met Gly
275 280 285

Thr Gly Pro Ile Pro Ser Ser Arg Lys Ala Leu Glu Lys Ala Gly Trp
290 295 300

Ser Val Gly Asp Leu Asp Leu Val Glu Ala Asn Glu Ala Phe Ala Ala
305 310 315 320

Gln Ala Cys Ala Val Asn Lys Asp Met Gly Trp Asp Pro Ser Ile Val
325 330 335

Asn Val Asn Gly Gly Ala Ile Ala Ile Gly His Pro Ile Gly Ala Ser
340 345 350

Gly Ala Arg Ile Leu Asn Thr Leu Leu Phe Glu Met Gln Arg Arg Asp
355 360 365

Ala Lys Lys Gly Leu Ala Thr Leu Cys Ile Gly Gly Gly Met Gly Val
370 375 380

Ala Met Cys Leu Glu Arg
385 390



<210> 179

<211> 240

<212> PRT

<213> Paracoccus sp. R114

<220>

<221> misc_feature
<222> (1179)..(1194)

<223> inverted repeat between genes constituting a putative transcriptional sto

<220>

<221> misc_feature
<222> (1196)..(1210)

<223> inverted repeat between genes constituting a putative transcriptional sto

<400> 179

Met Ser Lys Val Ala Leu Val Thr Gly Gly Ser Arg Gly Ile Gly Ala
1 5 10 15

Glu Ile Cys Lys Ala Leu Gln Ala Ala Gly Tyr Thr Val Ala Ala Asn
20 25 30

Tyr Ala Gly Asn Asp Asp Ala Ala Lys Ala Phe Thr Glu Glu Thr Gly
35 40 45

Ile Lys Thr Tyr Lys Trp Ser Val Ala Asp Tyr Asp Ala Cys Lys Ala
50 55 60

Gly Ile Ala Gln Val Glu Glu Asp Leu Gly Pro Ile Ala Val Leu Ile
65 70 75 80

Asn Asn Ala Gly Ile Thr Arg Asp Ala Pro Phe His Lys Met Thr Pro
85 90 95

Glu Lys Trp Lys Glu Val Ile Asp Thr Asn Leu Thr Gly Thr Phe Asn
100 105 110

Met Thr His Pro Val Trp Pro Gly Met Arg Glu Arg Lys Phe Gly Arg
115 120 125

Val Ile Asn Ile Ser Ser Ile Asn Gly Gln Lys Gly Gln Phe Gly Gln
130 135 140

Ala Asn Tyr Ala Ala Ala Lys Ala Gly Asp Leu Gly Phe Thr Lys Ser
145 150 155 160

Leu Ala Gln Glu Gly Ala Arg Asn Asn Ile Thr Val Asn Ala Ile Cys
165 170 175

Pro Gly Tyr Ile Ala Thr Asp Met Val Met Ala Val Pro Glu Gln Val
180 185 190

Arg Glu Gly Ile Ile Ala Gln Ile Pro Val Gly Arg Leu Gly Glu Pro
195 200 205

Ser Glu Ile Ala Arg Cys Val Val Phe Leu Ala Ser Asp Asp Ala Gly
210 215 220

Phe Val Thr Gly Ser Thr Ile Thr Ala Asn Gly Gly Gln Tyr Tyr Ile
225 230 235 240

<210> 180 

<211> 729 

<212> DNA 

<213> Paracoccus carotinifaciens E-396

<220>

<221> CDS

<222> (1)..(726)

<223> Beta-carotene Beta-4 oxygenase 



<400> 180 

atg agc gca cat gcc ctg ccc aag gca gat ctg acc gcc acc agt ttg 48
Met Ser Ala His Ala Leu Pro Lys Ala Asp Leu Thr Ala Thr Ser Leu
1 5 10 15

atc gtc tcg ggc ggc atc atc gcc gcg tgg ctg gcc ctg cat gtg cat 96
Ile Val Ser Gly Gly Ile Ile Ala Ala Trp Leu Ala Leu His Val His
20 25 30

gcg ctg tgg ttt ctg gac gcg gcg gcg cat ccc atc ctg gcg gtc gcg 144
Ala Leu Trp Phe Leu Asp Ala Ala Ala His Pro Ile Leu Ala Val Ala
35 40 45

aat ttc ctg ggg ctg acc tgg ctg tcg gtc ggt ctg ttc atc atc gcg 192
Asn Phe Leu Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile Ile Ala
50 55 60

cat gac gcg atg cat ggg tcg gtc gtg ccg ggg cgc ccg cgc gcc aat 240
His Asp Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg Ala Asn
65 70 75 80

gcg gcg atg ggc cag ctt gtc ctg tgg ctg tat gcc gga ttt tcc tgg 288
Ala Ala Met Gly Gln Leu Val Leu Trp Leu Tyr Ala Gly Phe Ser Trp
85 90 95

cgc aag atg atc gtc aag cac atg gcc cat cat cgc cat gcc gga acc 336
Arg Lys Met Ile Val Lys His Met Ala His His Arg His Ala Gly Thr
100 105 110

gac gac gac cca gat ttc gac cat ggc ggc ccg gtc cgc tgg tac gcc 384
Asp Asp Asp Pro Asp Phe Asp His Gly Gly Pro Val Arg Trp Tyr Ala
115 120 125

cgc ttc atc ggc acc tat ttc ggc tgg cgc gag ggg ctg ctg ccg ccc 432
Arg Phe Ile Gly Thr Tyr Phe Gly Trp Arg Glu Gly Leu Leu Leu Pro
130 135 140

gtc atc gtg acg gtc tat gcg ctg atg ttg ggg gat cgc tgg atg tac 480
Val Ile Val Thr Val Tyr Ala Leu Met Leu Gly Asp Arg Trp Met Tyr
145 150 155 160

gtg gtc ttc tgg ccg ttg ccg tcg atc ctg gcg tcg atc cag ctg ttc 528
Val Val Phe Trp Pro Leu Pro Ser Ile Leu Ala Ser Ile Gln Leu Phe
165 170 175

gtg ttc ggc atc tgg ctg ccg cac cgc ccc ggc cac gac gcg ttc ccg 576
Val Phe Gly Ile Trp Leu Pro His Arg Pro Gly His Asp Ala Phe Pro
180 185 190

gac cgc cac aat gcg cgg tcg tcg cgg atc agc gac ccc gtg tcg ctg 624
Asp Arg His Asn Ala Arg Ser Ser Arg Ile Ser Asp Pro Val Ser Leu
195 200 205

ctg acc tgc ttt cac ttt ggc ggt tat cat cac gaa cac cac ctg cac 672
Leu Thr Cys Phe His Phe Gly Gly Tyr His His Glu His His Leu His
210 215 220

ccg acg gtg cct tgg tgg cgc ctg ccc agc acc cgc acc aag ggg gac 720
Pro Thr Val Pro Trp Trp Arg Leu Pro Ser Thr Arg Thr Lys Gly Asp
225 230 235 240

acc gca tga 729
Thr Ala

<210> 181

<211> 242

<212> PRT

<213> Paracoccus carotinifaciens E-396



<400> 181 



Met Ser Ala His Ala Leu Pro Lys Ala Asp Leu Thr Ala Thr Ser Leu
1 5 10 15

Ile Val Ser Gly Gly Ile Ile Ala Ala Trp Leu Ala Leu His Val His
20 25 30

Ala Leu Trp Phe Leu Asp Ala Ala Ala His Pro Ile Leu Ala Val Ala
35 40 45

Asn Phe Leu Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile Ile Ala
50 55 60

His Asp Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg Ala Asn
65 70 75 80

Ala Ala Met Gly Gln Leu Val Leu Trp Leu Tyr Ala Gly Phe Ser Trp
85 90 95

Arg Lys Met Ile Val Lys His Met Ala His His Arg His Ala Gly Thr
100 105 110

Asp Asp Asp Pro Asp Phe Asp His Gly Gly Pro Val Arg Trp Tyr Ala
115 120 125

Arg Phe Ile Gly Thr Tyr Phe Gly Trp Arg Glu Gly Leu Leu Leu Pro
130 135 140

Val Ile Val Thr Val Tyr Ala Leu Met Leu Gly Asp Arg Trp Met Tyr
145 150 155 160

Val Val Phe Trp Pro Leu Pro Ser Ile Leu Ala Ser Ile Gln Leu Phe
165 170 175

Val Phe Gly Ile Trp Leu Pro His Arg Pro Gly His Asp Ala Phe Pro
180 185 190

Asp Arg His Asn Ala Arg Ser Ser Arg Ile Ser Asp Pro Val Ser Leu
195 200 205

Leu Thr Cys Phe His Phe Gly Gly Tyr His His Glu His His Leu His
210 215 220

Pro Thr Val Pro Trp Trp Arg Leu Pro Ser Thr Arg Thr Lys Gly Asp
225 230 235 240

Thr Ala




<210> 182 

<211> 510 

<212> DNA 

<213> Paracoccus sp. R1534 



<220> 

<221> CDS 

<222> (1)..(507)

<223> Beta-Carotene hydroxylase 



<400> 182 

atg agc act tgg gcc gca atc ctg acc gtc atc ctg acc gtc gcc gcg 48
Met Ser Thr Trp Ala Ala Ile Leu Thr Val Ile Leu Thr Val Ala Ala
1 5 10 15

atg gag ctg acg gcc tac tcc gtc cat cgg tgg atc atg cat ggc ccc 96
Met Glu Leu Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro
20 25 30

ctg ggc tgg ggc tgg cat aaa tcg cac cac gac gag gat cac gac cac 144
Leu Gly Trp Gly Trp His Lys Ser His His Asp Glu Asp His Asp His
35 40 45

gcg ctc gag aag aac gac ctc tat ggc gtc atc ttc gcg gta atc tcg 192
Ala Leu Glu Lys Asn Asp Leu Tyr Gly Val Ile Phe Ala Val Ile Ser
50 55 60

atc gtg ctg ttc gcg atc ggc gcg atg ggg tcg gat ctg gcc tgg tgg 240
Ile Val Leu Phe Ala Ile Gly Ala Met Gly Ser Asp Leu Ala Trp Trp
65 70 75 80

ctg gcg gtg ggg gtc acc tgc tac ggg ctg atc tac tat ttc ctg cat 288
Leu Ala Val Gly Val Thr Cys Tyr Gly Leu Ile Tyr Tyr Phe Leu His
85 90 95

gac ggc ttg gtg cat ggg cgc tgg ccg ttc cgc tat gtc ccc aag cgc 336
Asp Gly Leu Val His Gly Arg Trp Pro Phe Arg Tyr Val Pro Lys Arg
100 105 110

ggc tat ctt cgt cgc gtc tac cag gca cac agg atg cat cac gcg gtc 384
Gly Tyr Leu Arg Arg Val Tyr Gln Ala His Arg Met His His Ala Val
115 120 125

cat ggc cgc gag aac tgc gtc agc ttc ggt ttc atc tgg gcg ccc tcg 432
His Gly Arg Glu Asn Cys Val Ser Phe Gly Phe Ile Trp Ala Pro Ser
130 135 140

gtc gac agc ctc aag gca gag ctg aaa cgc tcg ggc gcg ctg ctg aag 480
Val Asp Ser Leu Lys Ala Glu Leu Lys Arg Ser Gly Ala Leu Leu Lys
145 150 155 160

gac cgc gaa ggg gcg gat cgc 510
Asp Arg Glu Gly Ala Asp Arg
165

<210> 183

<211> 169

<212> PRT

<213> Paracoccus sp. R1534

<400> 183

Met Ser Thr Trp Ala Ala Ile Leu Thr Val Ile Leu Thr Val Ala Ala
1 5 10 15

Met Glu Leu Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro
20 25 30

Leu Gly Trp Gly Trp His Lys Ser His His Asp Glu Asp His Asp His
35 40 45

Ala Leu Glu Lys Asn Asp Leu Tyr Gly Val Ile Phe Ala Val Ile Ser
50 55 60

Ile Val Leu Phe Ala Ile Gly Ala Met Gly Ser Asp Leu Ala Trp Trp
65 70 75 80

Leu Ala Val Gly Val Thr Cys Tyr Gly Leu Ile Tyr Tyr Phe Leu His
85 90 95

Asp Gly Leu Val His Gly Arg Trp Pro Phe Arg Tyr Val Pro Lys Arg
100 105 110

Gly Tyr Leu Arg Arg Val Tyr Gln Ala His Arg Met His His Ala Val
115 120 125

His Gly Arg Glu Asn Cys Val Ser Phe Gly Phe Ile Trp Ala Pro Ser
130 135 140

Val Asp Ser Leu Lys Ala Glu Leu Lys Arg Ser Gly Ala Leu Leu Lys
145 150 155 160

Asp Arg Glu Gly Ala Asp Arg Asn Thr
165



<210> 184 

<211> 888 

<212> DNA 

<213> Paracoccus sp. R1534 



<220> 

<221> CDS 

<222> (1)..(885)

<223> farnesyl transferase or geranylgeranyl diphosphate synthase 



<400> 184 

atg acg ccc aag cag caa ttc ccc eta cgc gat ctg gtc gag ate agg 

Met Thr Pro Lys Gin Gin Phe Pro Leu Arg Asp Leu Val Glu lie Arg 
15 10 15 

ctg gcg cag ate teg ggc cag ttc ggc gtg gtc teg gec ccg etc ggc 
Leu Ala Gin lie Ser Gly Gin Phe Gly Val Val Ser Ala Pro Leu Gly 

20 25 30 



48 



96 



gcg gcc atg agc gat gcc gcc ctg tcc ccc ggc aaa cgc ttt cgc gcc 144
Ala Ala Met Ser Asp Ala Ala Leu Ser Pro Gly Lys Arg Phe Arg Ala
35 40 45

gtg ctg atg ctg atg gtc gcc gaa agc tcg ggc ggg gtc tgc gat gcg 192
Val Leu Met Leu Met Val Ala Glu Ser Ser Gly Gly Val Cys Asp Ala
50 55 60

atg gtc gat gcc gcc tgc gcg gtc gag atg gtc cat gcc gca tcg ctg 240
Met Val Asp Ala Ala Cys Ala Val Glu Met Val His Ala Ala Ser Leu
65 70 75 80

atc ttc gac gac atg ccc tgc atg gac gat gcc agg acc cgt cgc ggt 288
Ile Phe Asp Asp Met Pro Cys Met Asp Asp Ala Arg Thr Arg Arg Gly
85 90 95
cag ccc gcc acc cat gtc gcc cat ggc gag ggg cgc gcg gtg ctt gcg 336
Gln Pro Ala Thr His Val Ala His Gly Glu Gly Arg Ala Val Leu Ala
100 105 110

ggc atc gcc ctg atc acc gag gcc atg cgg att ttg ggc gag gcg cgc 384
Gly Ile Ala Leu Ile Thr Glu Ala Met Arg Ile Leu Gly Glu Ala Arg
115 120 125

ggc gcg acg ccg gat cag cgc gca agg ctg gtc gca tcc atg tcg cgc 432
Gly Ala Thr Pro Asp Gln Arg Ala Arg Leu Val Ala Ser Met Ser Arg
130 135 140

gcg atg gga ccg gtg ggg ctg tgc gca ggg cag gat ctg gac ctg cac 480
Ala Met Gly Pro Val Gly Leu Cys Ala Gly Gln Asp Leu Asp Leu His
145 150 155 160

gcc ccc aag gac gcc gcc ggg atc gaa cgt gaa cag gac ctc aag acc 528
Ala Pro Lys Asp Ala Ala Gly Ile Glu Arg Glu Gln Asp Leu Lys Thr
165 170 175

ggc gtg ctg ttc gtc gcg ggc ctc gag atg ctg tcc att att aag ggt 576
Gly Val Leu Phe Val Ala Gly Leu Glu Met Leu Ser Ile Ile Lys Gly
180 185 190

ctg gac aag gcc gag acc gag cag ctc atg gcc ttc ggg cgt cag ctt 624
Leu Asp Lys Ala Glu Thr Glu Gln Leu Met Ala Phe Gly Arg Gln Leu
195 200 205

ggt cgg gtc ttc cag tcc tat gac gac ctg ctg gac gtg atc ggc gac 672
Gly Arg Val Phe Gln Ser Tyr Asp Asp Leu Leu Asp Val Ile Gly Asp
210 215 220

aag gcc agc acc ggc aag gat acg ggg cgc gac acc gcc gcc ccc ggc 720
Lys Ala Ser Thr Gly Lys Asp Thr Gly Arg Asp Thr Ala Ala Pro Gly
225 230 235 240

cca aag cgc ggc ctg atg gcg gtc gga cag atg ggc gac gtg gcg cag 768
Pro Lys Arg Gly Leu Met Ala Val Gly Gln Met Gly Asp Val Ala Gln
245 250 255

cat tac cgc gcc agc cgc gcg caa ctg gac gag ctg atg cgc acc cgg 816
His Tyr Arg Ala Ser Arg Ala Gln Leu Asp Glu Leu Met Arg Thr Arg
260 265 270

ctg ttc cgc ggg ggg cag atc gcg gac ctg ctg gcc cgc gtg ctg ccg 864
Leu Phe Arg Gly Gly Gln Ile Ala Asp Leu Leu Ala Arg Val Leu Pro
275 280 285

cat gac atc cgc cgc agc gcc tag 888
His Asp Ile Arg Arg Ser Ala
290 295



<210> 185 



<211> 295 



<212> PRT 

<213> Paracoccus sp. R1534 



<400> 185 

Met Thr Pro Lys Gln Gln Phe Pro Leu Arg Asp Leu Val Glu Ile Arg
1 5 10 15

Leu Ala Gln Ile Ser Gly Gln Phe Gly Val Val Ser Ala Pro Leu Gly
20 25 30

Ala Ala Met Ser Asp Ala Ala Leu Ser Pro Gly Lys Arg Phe Arg Ala
35 40 45

Val Leu Met Leu Met Val Ala Glu Ser Ser Gly Gly Val Cys Asp Ala
50 55 60

Met Val Asp Ala Ala Cys Ala Val Glu Met Val His Ala Ala Ser Leu
65 70 75 80

Ile Phe Asp Asp Met Pro Cys Met Asp Asp Ala Arg Thr Arg Arg Gly
85 90 95

Gln Pro Ala Thr His Val Ala His Gly Glu Gly Arg Ala Val Leu Ala
100 105 110

Gly Ile Ala Leu Ile Thr Glu Ala Met Arg Ile Leu Gly Glu Ala Arg
115 120 125

Gly Ala Thr Pro Asp Gln Arg Ala Arg Leu Val Ala Ser Met Ser Arg
130 135 140

Ala Met Gly Pro Val Gly Leu Cys Ala Gly Gln Asp Leu Asp Leu His
145 150 155 160

Ala Pro Lys Asp Ala Ala Gly Ile Glu Arg Glu Gln Asp Leu Lys Thr
165 170 175

Gly Val Leu Phe Val Ala Gly Leu Glu Met Leu Ser Ile Ile Lys Gly
180 185 190

Leu Asp Lys Ala Glu Thr Glu Gln Leu Met Ala Phe Gly Arg Gln Leu






195 200 205 

Gly Arg Val Phe Gln Ser Tyr Asp Asp Leu Leu Asp Val Ile Gly Asp
210 215 220

Lys Ala Ser Thr Gly Lys Asp Thr Gly Arg Asp Thr Ala Ala Pro Gly
225 230 235 240

Pro Lys Arg Gly Leu Met Ala Val Gly Gln Met Gly Asp Val Ala Gln
245 250 255

His Tyr Arg Ala Ser Arg Ala Gln Leu Asp Glu Leu Met Arg Thr Arg
260 265 270

Leu Phe Arg Gly Gly Gln Ile Ala Asp Leu Leu Ala Arg Val Leu Pro
275 280 285

His Asp Ile Arg Arg Ser Ala
290 295



<210> 186 

<211> 30 

<212> DNA 

<213> synthetic construct 



<400> 186 

aaggcctcat atgagcgcac atgccctgcc 30

<210> 187

<211> 28

<212> DNA

<213> synthetic construct

<400> 187

cgggatcctc atgcggtgtc ccccttgg 28

<210> 188

<211> 30
<212> DNA 

<213> synthetic construct 



<400> 188 

aaggcctcat atgagcactt gggccgcaat 30 

<210> 189 

<211> 30 

<212> DNA 

<213> synthetic construct 



<400> 189 

aggatcctca tgtattgcga tccgcccctt 30 

<210> 190 

<211> 52 

<212> DNA 

<213> synthetic construct 



<400> 190 

gtgcagcctc aggtcgacat atgcggccgc atccggatcc ctcctcctcc ag 52

<210> 191 

<211> 52 

<212> DNA 

<213> synthetic construct 



<400> 191 

cacgtcggag tccagctgta tacgccggcg taggcctagg gaggaggagg tc 52 

<210> 192 
<211> 52 



<212> DNA

<213> synthetic construct 



<400> 192

gtgcaggagg aggtcgacat atgcggccgc atccggatcc ctgaggctcc ag 52

<210> 193 

<211> 52 

<212> DNA 

<213> synthetic construct 



<400> 193

cacgtcctcc tccagctgta tacgccggcg taggcctagg gactccgagg tc 52

<210> 194 

<211> 52 

<212> DNA 

<213> synthetic construct 



<400> 194

ctggagcctc aggtcgacat atgcggccgc atccggatcc ctcctcctgc ac 52



<210> 195 

<211> 52 

<212> DNA 

<213> synthetic construct 



<400> 195

gacctcggag tccagctgta tacgccggcg taggcctagg gaggaggacg tg 52

<210> 196 
<211> 52 



<212> DNA

<213> synthetic construct 
<400> 196 

ctggaggagg aggtcgacat atgcggccgc atccggatcc ctgaggctgc ac 52 

<210> 197 

<211> 52 

<212> DNA 

<213> synthetic construct 


<400> 197 

gacctcctcc tccagctgta tacgccggcg taggcctagg gactccgacg tg 52
BUDAPEST TREATY ON THE INTERNATIONAL RECOGNITION OF
THE DEPOSIT OF MICROORGANISMS FOR THE PURPOSES OF PATENT PROCEDURE

INTERNATIONAL FORM

RECEIPT IN THE CASE OF AN ORIGINAL DEPOSIT ISSUED PURSUANT TO RULE 7.3
AND VIABILITY STATEMENT ISSUED PURSUANT TO RULE 10.2

To: (Name and Address of Depositor or Attorney)

Roche Vitamins Inc.
Attn: Markus Humbelin
340 Kingsland Street
Nutley, NJ 07110-1199

Deposited on Behalf of: Roche Vitamins Inc.

Identification Reference by Depositor: Patent Deposit Designation

Paracoccus sp. R-1506 PTA-3431

The deposit was accompanied by: - a scientific description _ a proposed taxonomic description indicated above.

The deposit was received June 5, 2001 by this International Depository Authority and has been accepted.

AT YOUR REQUEST: We will inform you of requests for the strain for 30 years.

The strain will be made available if a patent office signatory to the Budapest Treaty certifies one's right to receive,
or if a U.S. Patent is issued citing the strain, and ATCC is instructed by the United States Patent & Trademark
Office or the depositor to release said strain.

If the culture should die or be destroyed during the effective term of the deposit, it shall be your responsibility to
replace it with living culture of the same.

The strain will be maintained for a period of at least 30 years from date of deposit, or five years after the most
recent request for a sample, whichever is longer. The United States and many other countries are signatory to the
Budapest Treaty.
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said polypeptides comprising SEQ.ID.No. 24 and their 
use. 



1.2. Claim : 14 

A method of making a carotenoid-producing cell 

1.3. Claim : 16 

A microorganism of the genus Paracoccus 
An isolated polypeptide having the amino acid sequence shown
in SEQ.ID.No. 159; a polynucleotide encoding said
polypeptide comprising SEQ.ID.No. 157 and their use.
An isolated polypeptide having the amino acid sequence shown
in SEQ.ID.No. 160; a polynucleotide encoding said
polypeptide comprising SEQ.ID.No. 157 and their use.



4. Claims: 4, 5, 8 completely and 9-13 partially 

An isolated polypeptide having the amino acid sequence shown
in SEQ.ID.No. 178 or 179, respectively; a polynucleotide encoding said
polypeptides comprising SEQ.ID.No. 177 and their use.
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